Diving into Java (with strong JavaScript and Python background) Part 3 (Arrays, Lists, and Objects)
I’m trekking through this adventure of learning Java and becoming proficient. Once again, my game plan for this task is:
- √ Load most common environment
- √ Research naming conventions
- √ Hello World app to confirm environment set up properly
- √ Review and practice basic data types and available methods in Java
- Review and practice equivalents to arrays/lists and objects/dictionaries in Java
- Algorithm practice
- Sample projects with the above knowledge
- Learn Spring framework
- Sample projects with Spring with REST API and WebSockets or Socket.IO
In the last article, I explored the primitive data types of Java. The key thing about the primitive data types is that each variable is stored in one place in memory with one value, serving a single purpose (no sub-purposes). Building an application that stores a list of tasks, or maybe a list of addresses attached to a person’s name and perhaps phone number, using only those single-value variables would be extremely difficult and nightmarish to maintain:
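As a rough sketch of that pain in Java, here is a task list built from one variable per task. The class name and the task names are hypothetical, and String is technically not a Java primitive, but it plays the same single-value role here:

```java
class TasksWithoutCollections {
    // Hypothetical task names; every task needs its own variable.
    static String task1 = "Buy groceries";
    static String task2 = "Wash the car";
    static String task3 = "Pay the bills";

    // Every new task means another variable above and another line here.
    static String describe() {
        return "1. " + task1 + "\n"
             + "2. " + task2 + "\n"
             + "3. " + task3;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```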
Imagine how many lines of code it would take if you wanted to allow 25 tasks, 50 tasks, or 1000 tasks. And this is just to manage the information for the tasks in a set order. Wouldn’t it be nice to be able to store the entire list within one variable? This is where data types like Arrays, Lists, and Objects save the day!
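A minimal Java sketch of the single-variable approach might look like the following. It uses an ArrayList, which is covered later in this article; the class name, the describe helper, and the task names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TasksInOneVariable {
    // Builds a numbered listing; works unchanged for any number of tasks.
    static String describe(List<String> tasks) {
        StringBuilder listing = new StringBuilder();
        for (int i = 0; i < tasks.size(); i++) {
            listing.append(i + 1).append(". ").append(tasks.get(i)).append("\n");
        }
        return listing.toString();
    }

    public static void main(String[] args) {
        // The whole list lives in one variable.
        List<String> tasks = new ArrayList<>(
                Arrays.asList("Buy groceries", "Wash the car", "Pay the bills"));

        // Adding a task does not require any new variables.
        tasks.add("Call the bank");

        System.out.print(describe(tasks));
    }
}
```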
This is much cleaner. I can make the task list as long as I want (at least as long as there is room in my computer’s memory) without major changes to the code. The above example is just tracking one thing per item. For an address collection, imagine trying to do person1_name, person1_address, person1_phone, person2_name, etc. Not fun. We’ll explore the solution to that scenario towards the end of this article. Let’s jump into Arrays in Java!
Arrays
Probably the hardest thing to keep straight while switching back and forth between different programming languages is the name each language uses for a construct. In the above example, Python would call the variable tasks a list, while JavaScript would call it an array. Python has no built-in data type called an array, so transitioning terminology between those two languages is somewhat easy. Java, on the other hand, has a construct called an array, as well as one called a list. Java arrays function quite a bit differently than a JavaScript array. Java’s list is more in line with the functionality described above. Java’s array does not have a direct equivalent (in terms of functionality and low-level management) among the everyday built-in types of Python and JavaScript, as far as I’m aware; the closest relatives are Python’s array module and JavaScript’s typed arrays such as Int32Array.
Arrays in Java are a group of items of the same data type AND are “physically” stored next to each other within memory with a FIXED size. Let’s say you sold your house and bought another house, but there’s a gap between when you have to move out of your old house and when you can move into the new one. In the meantime, you decide to rent multiple storage units that can be dedicated to various rooms in your house. You were able to rent 5 storage units that are next to each other. So you have an array of storage units, each containing room objects. A week later, you discover you forgot to rent a unit for the items in your shed. You go to rent the next storage unit down from the rest of your storage units, only to discover someone else has already rented it, so you decide to rent 6 units on the other side and move the contents of the first 5 there so that everything is together for when you are ready to move to the new house.
When working with arrays in Java, once you set the array size, it is set permanently. You can either set the size of the array first and then populate values, or you can supply a set of initial values. You can even nest arrays inside of arrays, making a multi-dimensional array (side note: the inner arrays won’t necessarily be grouped together; the outer array will contain references to where each of the inner arrays is in memory). If you need to add another item (the technical term is element) to an array, or remove one, a brand new array has to be created and placed into memory. So what makes this useful? If you have a scenario with a fixed length, you have extremely fast access to each of those elements. Back to the storage unit analogy: if all your storage units are grouped together, you just have to memorize one location to get access to all those units. Know the location of one, know the location of all.
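Here is a small sketch of those rules in Java. The class name, the append helper, and the sample values are hypothetical; Arrays.copyOf is the standard library call that builds the brand new, longer array:

```java
import java.util.Arrays;

class FixedSizeArrays {
    // "Adding" an element really means building a new, longer array
    // and copying the old elements into it.
    static String[] append(String[] original, String extra) {
        String[] bigger = Arrays.copyOf(original, original.length + 1);
        bigger[original.length] = extra;
        return bigger;
    }

    public static void main(String[] args) {
        // Option 1: set the size first, then populate the values.
        int[] scores = new int[3]; // starts out as {0, 0, 0}
        scores[0] = 90;
        scores[1] = 75;
        scores[2] = 88;

        // Option 2: supply the initial values; the size is inferred (5 here).
        String[] units = {"Kitchen", "Living room", "Master bedroom", "Office", "Garage"};

        // Nested arrays: the outer array holds references to the inner arrays.
        int[][] grid = {{1, 2, 3}, {4, 5, 6}};

        // The original array keeps its size; the result is a separate array.
        String[] moreUnits = append(units, "Shed");

        System.out.println(Arrays.toString(scores)); // [90, 75, 88]
        System.out.println(units.length);            // 5
        System.out.println(moreUnits.length);        // 6
        System.out.println(grid[1][2]);              // 6
    }
}
```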
An array also requires less memory than other data structures that store a list of values. Each value stored takes up one spot in memory in an array. In other constructs for tracking a list of items, in addition to storing a value for each element, you ALSO need to store where to find the next value (or where to find multiple other values in the construct). With the storage units example, let’s say instead you were unable to get your storage units next to each other. You decide it would be easiest to only memorize the location of the first storage unit. You write down the location of the second storage unit and leave it in the first, write down the location of the third and leave it in the second, and so on. The advantage of this approach is that whenever a new item is needed, it can be placed anywhere in memory without having to shift the existing elements around; the only update is to the reference in what used to be the last element. The cost is time: you have to travel all the way down the list to find the item you need. Imagine having 100 storage units and trying to get an item out of the last one. The more items you have, the longer it can take to access the last item (the worst case scenario for getting an item).
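The chain of notes described above is essentially a linked list. A minimal sketch, assuming hypothetical names (StorageUnitChain, Unit) and, for convenience, also remembering the last unit so that adding does not require a walk:

```java
class StorageUnitChain {
    // One unit: its contents plus a note saying where the next unit is.
    static class Unit {
        String contents;
        Unit next; // null means this is the last unit in the chain

        Unit(String contents) {
            this.contents = contents;
        }
    }

    Unit first;
    Unit last; // kept only so add() can skip walking the chain

    // A new unit can live anywhere; only the old last unit's note changes.
    void add(String contents) {
        Unit unit = new Unit(contents);
        if (first == null) {
            first = unit;
        } else {
            last.next = unit;
        }
        last = unit;
    }

    // Reaching unit number `index` means following the notes one by one.
    // Assumes index is within range.
    String get(int index) {
        Unit current = first;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.contents;
    }

    public static void main(String[] args) {
        StorageUnitChain chain = new StorageUnitChain();
        chain.add("Kitchen");
        chain.add("Office");
        chain.add("Shed");
        System.out.println(chain.get(2)); // Shed, after visiting two units first
    }
}
```

Java’s standard library already ships a java.util.LinkedList class, so a hand-rolled version like this is only useful for seeing the mechanics.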
Fortunately, most programming languages I use (or have on my list of languages to learn) handle this particular piece of management behind the scenes, but understanding how data types can be implemented at the lower level can influence how you write code for optimal performance.
Java ArrayLists, Python Lists, and JavaScript Arrays
So what happens behind the scenes for the similar data structure between the three languages? Let’s say instead of storing the location to the next storage unit, we instead created a master list of just the addresses:
Great! I can now go to any of my storage units directly! It doesn’t matter how big my list gets; I have direct access to any unit at will, without having to stop at other units along the way to reach the target unit. I have no idea what any of the units contain from looking at my list, but that’s okay. I know that each of the units stores something that I will move to the new house. I also know that when I built this list, I put some essential items and non-essential items in each of the units. When I finally move to the house, I make one pass to gather only the essential items from each of the units, then another pass later for the rest. Arrays are created so you can perform common operations across all the elements. By creating a secondary list that holds the locations of all the units, the number of units you can have can grow indefinitely. This is how lists in Python, arrays in JavaScript (in most circumstances; an array of numbers works a bit differently), and ArrayLists in Java work under the hood. Let’s look at some code:
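A minimal Java sketch of those operations with java.util.ArrayList follows. The class name, the timesFour helper, and the sample numbers are assumptions chosen for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class ArrayListBasics {
    // Returns a new list with every element multiplied by 4.
    static List<Integer> timesFour(List<Integer> numbers) {
        List<Integer> result = new ArrayList<>();
        for (int n : numbers) {
            result.add(n * 4);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(1);
        numbers.add(2);
        numbers.add(3);                     // [1, 2, 3]

        int firstValue = numbers.get(0);    // read by index: 1
        numbers.set(1, 20);                 // overwrite by index: [1, 20, 3]
        numbers.add(4);                     // grow the list: [1, 20, 3, 4]
        numbers.remove(0);                  // an int argument removes by index: [20, 3, 4]

        System.out.println(firstValue);     // 1

        // Loop through every element and apply the same operation.
        for (int n : numbers) {
            System.out.println(n * 4);      // 80, then 12, then 16
        }
    }
}
```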
Some basic features available through this data type are the ability to quickly access any item within it, whether to read the value or set a new value, the ability to add or remove items, and most importantly, the ability to loop or iterate through every item and perform a common operation on every element (in this case, multiply each integer by 4 and print the result).
Objects, Dictionaries and Hashtables (and Classes)
So what if I wanted to take my list of storage units and have an idea of what I can expect to find at each location? Maybe I only want to get the items from the storage unit with all my Master Bedroom items on the first day I move in, so I can sleep and gather energy to get the rest of the stuff? Instead of keeping a list of just the locations of each storage unit, I can add a label next to each location so I have an idea of what to expect in each unit:
JavaScript objects, Python dictionaries, and Java hash tables (most commonly the HashMap class) model this. Behind the scenes, a key is run through a hash function to compute an index, allowing us to find the referenced value quickly in memory. Let’s say I wanted to model a person using code, capturing first name, last name, and address:
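In Java, one way to sketch this is with java.util.HashMap, which is an assumption about which hash table class fits best here. The class name, keys, and values below are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class PersonAsMap {
    // Each label (key) points at one piece of information (value).
    static Map<String, String> makePerson() {
        Map<String, String> person = new HashMap<>();
        person.put("firstName", "Ada");
        person.put("lastName", "Lovelace");
        person.put("address", "123 Example Street");
        return person;
    }

    public static void main(String[] args) {
        Map<String, String> person = makePerson();

        // Look a value up directly by its label.
        System.out.println(person.get("firstName") + " " + person.get("lastName")); // Ada Lovelace
        System.out.println(person.get("address")); // 123 Example Street
    }
}
```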
Notice how I can now describe multiple descriptive aspects, or properties, all under a single variable? This is the beauty of objects in keeping code clean. The above example isn’t exactly best practice for making reusable code. A mechanism exists that allows us to describe how a person should be modeled and make new instances based on that description: a class. An instance of a class is also called an object:
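A minimal sketch of such a class in Java is below. The field names and sample values are hypothetical, and the hand-written toString method is just one simple way to get readable output; it is not the external library approach mentioned in this article:

```java
class Person {
    private final String firstName;
    private final String lastName;
    private final String address;

    // The constructor describes what every new person needs.
    Person(String firstName, String lastName, String address) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.address = address;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getAddress() { return address; }

    // Hand-written, object-like text format for printing.
    @Override
    public String toString() {
        return "{firstName: " + firstName
             + ", lastName: " + lastName
             + ", address: " + address + "}";
    }

    public static void main(String[] args) {
        // Each call to `new` produces another instance (object) of the class.
        Person ada = new Person("Ada", "Lovelace", "123 Example Street");
        Person alan = new Person("Alan", "Turing", "456 Sample Avenue");

        System.out.println(ada);
        System.out.println(alan.getFirstName()); // Alan
    }
}
```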
Notice how much cleaner it is to create a person object? Sure, some of the setup makes it look like a lot of work, but now we have the ability to produce countless versions of the same thing with minimal effort! Python did require an extra method in each class to print out in an object-like format. I needed to rely on an external library in Java to produce that format as well (some mild searching across different sites shows this can be a very complex process and there are no libraries in the standard Java API to handle this).
Notice how nothing special was needed in JavaScript? That’s because nearly everything in JavaScript is actually an object! The syntax and keywords just trick us into thinking we are using a class. Under the hood, we find that classes are just syntactic sugar that takes our class syntax and converts it into a function that makes an object containing properties and methods. There is quite a bit of syntactic sugar in modern JavaScript, meaning that the same things happen behind the scenes in terms of low-level data structures and execution; it’s just that the code we actually write looks cleaner to us!
With Java, you do have far more control and power in choosing the specific data structure that performs a given operation the fastest. As an example, in sorting algorithms, you’re given a choice between an Array and an ArrayList in Java. In Python, your only option would be the equivalent of ArrayList. In JavaScript, at least as implemented in the V8 engine used by Chrome and Node.js, it will either be a consecutive-position array like Java’s Array (if sorting numbers), or it will be like Java’s ArrayList (if sorting any other data type); you do not get to make that choice yourself.
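To make that choice concrete, here is a small sketch that sorts the same numbers once as a primitive array and once as an ArrayList, using the standard Arrays.sort and Collections.sort methods. The class and helper names are hypothetical, and the sketch makes no claim about which option is faster for a given workload:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortingChoices {
    // Sorts a copy of a primitive int array (values stored side by side).
    static int[] sortedArray(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy of a list of Integer objects (a list of references).
    static List<Integer> sortedList(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] asArray = {5, 3, 9, 1};
        List<Integer> asList = Arrays.asList(5, 3, 9, 1);

        System.out.println(Arrays.toString(sortedArray(asArray))); // [1, 3, 5, 9]
        System.out.println(sortedList(asList));                    // [1, 3, 5, 9]
    }
}
```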
Java has far more data structures to choose from than what I’ve presented in this article. I’ll explore those as I come across a need for them. In the next article, I’ll explore implementing some basic algorithms in Java!