Python: Objects and Mutability

Written by stefansilverio | Published 2019/01/10
Tech Story Tags: programming | python | engineering-culture | computer-science | python3

In Python, everything is an object. This is largely a consequence of the design principle “first-class everything” held by Guido van Rossum, the creator of the Python programming language. First-class everything means that every value, including functions and classes themselves, is an instance of some class. More generally, it means that everything is on the same “level” as everything else. Take a look at the following (everything is run in the python3 interpreter):
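As a minimal sketch of that idea, the script below prints the class of several ordinary values, a function, and a class. The sample values and the helper greet() are illustrative assumptions, not the article's original session.

```python
# Illustrative helper (hypothetical): functions are objects as well.
def greet():
    return "hello"

# Numbers, a string, containers, a function, and a class.
samples = [5, 2.5, "cat", [1, 2], (1, 2), greet, int]

for sample in samples:
    # type() reports the class that each object is an instance of.
    print(repr(sample), "->", type(sample).__name__)

# Classes are objects too: the class of int is type.
print(type(int).__name__)
```

Every entry, including the function and the class int, reports a class of its own.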

We used the built-in type() function to see what class each of our objects belongs to.

Even data types are objects of their respective classes. So if data types are just instances of classes, we can imagine that different data types have different attributes. For example, lists are mutable, while tuples, strings, and numeric types like int and float are immutable. Observe the following:
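A sketch of what such a session might look like for strings is shown here, written as a script rather than typed at the prompt. The names string1 and string2 come from the discussion; the literal "cat" is assumed from the result "cate". Sharing one object between equal literals is a CPython implementation detail, not a language guarantee.

```python
# Two names bound to equal string literals.
string1 = "cat"
string2 = "cat"

# CPython typically reuses one object for equal short literals.
shared_before = id(string1) == id(string2)
print(id(string1), id(string2), shared_before)

# Strings are immutable, so += builds a new object and rebinds string1 to it.
string1 += "e"
shared_after = id(string1) == id(string2)
print(string1, string2, shared_after)
```

After the +=, string2 still refers to the untouched "cat" object, while string1 refers to a new object.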

We use the built-in id() function to examine the identities of our objects (in CPython, an object's identity is its memory address).

In the first frame, we create two identical strings. Since strings are immutable objects (you can’t change a string in place), CPython can economize its memory usage by having both variables point to the same object. We can see this is true because string1 and string2 share the same memory address. However, in the second frame we attempt to change the value of string1 using “+= ‘e’.” Since strings are immutable, Python is forced to create a new string object holding “cate” and rebind string1 to it, while string2 still points to the original string. Observe the same behavior with integers:
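The integer version can be sketched the same way. The names a and b follow the discussion; the values are assumptions chosen for illustration. The sharing shown before the increment relies on CPython behavior.

```python
a = 5
b = 5

# In CPython both names refer to the same pre-allocated object for 5.
same_before = a is b

# Integers are immutable, so += rebinds a to a different object.
a += 1
same_after = a is b

print(same_before, a, b, same_after)
```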

Although integers appear to act the same way, this actually happens for a different reason. To economize memory usage, CPython pre-allocates the first 262 integers on start up. This means that the numbers -5 through 256 (inclusive) already exist as objects at fixed addresses in memory. That’s why a and b reference the same location in memory in the example above. Observe the following:
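A hedged sketch of that experiment follows. In the interactive interpreter you can type a = 300 and b = 300 on separate lines; inside a single script the compiler may merge equal literals into one object, so this version builds each integer at run time with int("300") to avoid that effect.

```python
# Build two equal integers at run time so the compiler cannot merge them.
a = int("300")
b = int("300")

print(a == b)           # the values are equal
print(a is b)           # identity depends on whether 300 is pre-allocated
print(id(a), id(b))
```

On CPython builds where the pre-allocated range ends at 256, the identity check prints False and the two ids differ.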

Since 300 is not pre-loaded in memory, a and b point to different addresses

More technically, CPython defines two macros, NSMALLPOSINTS and NSMALLNEGINTS, which set how many non-negative and negative integers are pre-allocated (257 and 5 respectively), giving the range -5 to 256. CPython stores all of these integer objects in an array. When we “create” an integer in that range (Ex: a = 5), we’re actually just telling our variable to point to an object stored in that array. The array covers this particular range because these are the most commonly used numbers. Once you ask for a number outside this range, CPython has to allocate a new integer object for it.
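One way to probe the edges of that array from Python is sketched below. The helper is_cached() is a hypothetical name introduced here, and the results describe CPython only; other implementations, or CPython builds with different macro values, may report different boundaries.

```python
def is_cached(n):
    """Return True if two separately built ints equal to n are one object."""
    text = str(n)
    # Each int(text) call builds or looks up an integer at run time.
    return int(text) is int(text)

# Probe values around the documented range of -5 to 256.
for n in (-6, -5, 0, 256, 257):
    print(n, is_cached(n))
```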

Of course, if we want a and b to point to the same object (even if the value is outside the range of our macros), we can assign one variable to the other:
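A minimal sketch of that assignment, using 300 as an illustrative value built at run time:

```python
a = int("300")
b = a  # b is bound to the very same object as a, no copy is made

print(a is b)            # True
print(id(a) == id(b))    # True
```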

Now a and b point to the same object

This last example demonstrates a mechanism known as “aliasing”. Since a and b now point to the same object, we can say that the object is aliased, or that a is an alias for b and vice versa.

Since integers are immutable objects, when we change the value of a in the second frame, a ends up pointing to a different object while the original object is left untouched. These last two examples demonstrate a smart way to think about variables in Python. Variables in Python are essentially names we use to refer to objects. They are like labels. This is clear in the integer example above because even after we say “a = 4” (tell a to point to a completely different object), we are still using the same name, or “label”, to refer to the new object.
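The label idea can be sketched as follows. The starting value 300 is an assumption for illustration; the rebinding a = 4 follows the text.

```python
a = int("300")
b = a                    # two labels on one object
original_id = id(a)

a = 4                    # move the label a to a different object

print(a, b)                    # 4 300
print(id(b) == original_id)    # True: b still labels the original object
print(a is b)                  # False
```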

Now we know a bit about the way Python handles immutable objects. Let’s examine mutable objects.

Lists are mutable objects. Let’s initialize two lists:
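A minimal sketch, with illustrative contents and the hypothetical names list_a and list_b:

```python
# Two list displays always build two separate list objects.
list_a = [1, 2, 3]
list_b = [1, 2, 3]

print(id(list_a), id(list_b))
print(id(list_a) == id(list_b))  # False
```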

Interestingly, Python does not try to economize and make list a and list b point to the same object even though they contain the same data. List a and list b are different objects. Because lists are mutable, sharing one object would make every change made through one name show up under the other, so each list gets its own object.

Now that we understand some important properties of mutable and immutable objects, let’s learn about some operators we can use to test the state of objects.

In the above example we use the “==” operator to test whether our variables point to objects containing the same data. Since both of our lists contain the same data, this evaluates to “True”.

In the above example we use the “is” operator to test whether our variables point to the same object. Since a and b point to different objects, this evaluates to “False.”
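The two comparisons can be sketched together, with illustrative list contents:

```python
a = [1, 2, 3]
b = [1, 2, 3]

print(a == b)  # True: the lists hold equal data
print(a is b)  # False: they are two separate objects

# "is" gives the same answer as comparing identities.
print((a is b) == (id(a) == id(b)))  # True
```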

To get a visual, we can imagine our above scenarios via the following diagram. The visual on the left, where a and b are pointing to separate objects containing the string “banana”, represents the situation we’ve created above.

In order to achieve the visual on the right where a and b are pointing to the same object, we need to do the following:
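A sketch of that step with illustrative list contents. Because both names now refer to one list, a change made through b is visible through a.

```python
a = [1, 2, 3]
b = a  # both names now refer to the same list

print(id(a) == id(b))  # True
print(a is b)          # True

b.append(4)
print(a)               # [1, 2, 3, 4]
```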

Now both names point to the same list object. We verified this in the second diagram, where we printed the address that each variable points to. This example also touches upon how assignment works in Python. We can think of variables as names, or references, to specific locations in memory. These memory locations contain objects (values). The statement a = 5, for instance, “binds” the name “a” to a certain location in memory that holds the value 5. We can “rebind” a variable by telling it to reference a different location in memory, for example with a = 6.

Now let’s take a brief look at tuples. Tuples are immutable data structures; however, they can contain mutable objects like lists. While you cannot add, remove, or replace the elements of a tuple once it’s been created, you can change the mutable objects stored inside it. Observe the following:
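A hedged sketch of such a session: the names dum and dee follow the discussion, while the tuple's contents are assumptions made for illustration.

```python
dum = ("tweedle", [1, 2])   # a tuple holding a string and a list
dee = dum[1]                # dee is another name for the inner list

print(id(dum) != id(dee))   # True: the tuple and the list are separate objects

dee.append(3)               # mutate the list through dee
print(dee)                  # [1, 2, 3]
print(dum)                  # ('tweedle', [1, 2, 3])

# The tuple itself still refuses item assignment.
try:
    dum[0] = "other"
except TypeError as error:
    print("TypeError:", error)
```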

In this example we create a tuple (“dum”) with a list inside it. We then assign the list object stored in the tuple to a different variable, “dee,” and verify that the tuple object and the list object inside it live at different memory addresses. Next we change the list stored inside the tuple and verify that the change shows up in both “dee” and “dum.” A tuple’s immutability covers only which objects it refers to, not the contents of those objects, so any mutable object inside a tuple can still be changed.

One final unique data structure in Python is the frozen set. Frozen sets are immutable versions of set objects. While elements can be added and removed from a regular Python set:

No elements can be added or removed from a frozen set:
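Both behaviors can be sketched side by side. The element values are illustrative.

```python
# A regular set supports adding and removing elements.
fruits = {"apple", "banana"}
fruits.add("cherry")
fruits.remove("apple")
print(sorted(fruits))   # ['banana', 'cherry']

# A frozenset has no add() or remove() methods at all.
frozen = frozenset(["apple", "banana"])
try:
    frozen.add("cherry")
except AttributeError as error:
    print("AttributeError:", error)

print(sorted(frozen))   # ['apple', 'banana']
```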

So why does this really matter? Why should we care about how Python treats mutable and immutable objects?

First of all you should care because if you want to be a skilled programmer you should know exactly what’s going on with your variables at all times.

Secondly, it matters because the way Python passes objects to functions is affected by the mutability of those objects. If you pass immutable objects to a function, the passing acts like “call by value”. Observe the following:
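A minimal sketch of such a function is given below. The name increment() and the variable b come from the discussion; the parameter name n and the exact print layout are assumptions.

```python
def increment(n):
    # Same id as the caller's object: the argument is not copied.
    print("inside, before:", id(n))
    n += 1
    # A different id: n now names a new integer object.
    print("inside, after: ", id(n), "value:", n)
    return n

b = 5
print("outside:       ", id(b))
result = increment(b)
print("b is still", b)
```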

In the first frame we define a function that increments and prints an integer variable. We tell the function to print object ids for debugging purposes. In the second frame we call the function. The first id printed matches the id of the integer object b printed outside the function. This tells us that object b was passed by reference to the function increment(). However, the second id, which we print after the increment, is different. Because integers are immutable, the increment cannot modify the object in place; instead, Python creates a new integer object holding the incremented value and rebinds the function’s local name to it. This is also why the value of the object in our function is 6, while the value of our object outside the function is 5.

We can do this same exercise with a mutable object:
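A sketch of the mutable case follows. The function name add_item() and the list contents are hypothetical; append() is the call named in the discussion.

```python
def add_item(items):
    before = id(items)
    items.append(4)          # mutate the caller's list in place
    after = id(items)
    print("same object inside the function:", before == after)
    return before == after

numbers = [1, 2, 3]
same_object = add_item(numbers)
print(numbers)               # [1, 2, 3, 4]
```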

We see from the first id statement that our object was passed by reference to our function just as last time. Yet unlike last time, the address printed by our second id() call (which is executed after append()) corresponds to the same object. Since lists are mutable data structures, Python is able to just add an element to the existing list instead of creating an entirely new object. We can also see that the updated list persists outside of our function scope.

This is why people say that Python passes arguments neither by value nor by reference, but instead uses “call by object reference”.

In conclusion, mutable built-in types in Python include list, dict, set, and bytearray. Immutable types include int, float, complex, tuple, str, frozenset, and bytes. Hopefully, you’re feeling more confident about the different ways Python handles mutable and immutable objects.


Published by HackerNoon on 2019/01/10