Dynamic Typing in Python
Exploring How Object References Work in Python
If you have a background in languages such as Java, C or C++ which are compiled or statically typed, you might find the way that Python works a bit confusing. For instance, when we assign a value to a variable (say a = 1 ), how the heck does Python know that variable a is an integer?
The Dynamic Typing model
In statically-typed languages the variables’ types are determined at compile-time. In most languages that support this static typing model, programmers must specify the type of each variable. For example, if you want to define an integer variable in Java, you will have to explicitly specify it in the definition.
// Java example
int a = 1;
int b;
b = 0;
On the other hand, types in Python are determined during run-time as opposed to compile-time and thus programmers are not required to declare variables before using them in the code.
Technically, a variable (also known as a name in Python) is created when it is assigned a value for the first time in the code. This means that a variable must first be assigned a value (i.e. a reference to an object in memory) before it is referenced in the code; otherwise Python raises a NameError. And as mentioned earlier, the variable assignment never comes with a type definition since the type is stored with objects and not with variables/names. Every time a variable is used in an expression, it is automatically replaced with the object in memory it references.
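The two points above can be checked with a short sketch. The names x and undefined_name are made up for this illustration; the second one is intentionally never assigned.

```python
# The type is stored with the object, so it changes when the name is rebound.
x = 1
type_before = type(x)
x = "one"
type_after = type(x)

# Using a name that was never assigned raises NameError.
try:
    undefined_name  # hypothetical name, intentionally never assigned
    raised = False
except NameError:
    raised = True

print(type_before, type_after, raised)
# <class 'int'> <class 'str'> True
```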
Relationship between objects, variables and references
To sum up, every time we assign a value to a variable, Python conceptually undertakes the following three steps:
- Create an object in memory that holds the value (unless the value is an object that already exists, as in b = a)
- If the variable name does not already exist in the namespace, go ahead and create it
- Assign the reference to the object (in memory) to the variable
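As a rough sketch of these steps, we can run an assignment against an explicit dictionary that plays the role of the namespace. Using exec with a hand-made dictionary is only a device for this illustration (the name namespace is an assumption), not how you would normally assign variables.

```python
# A plain dict stands in for the namespace (the table of names).
namespace = {}

# Before the first assignment, the name does not exist.
existed_before = "a" in namespace

# Run the assignment with `namespace` as its global scope.
exec("a = 1", namespace)

# The name now exists and holds a reference to an int object.
exists_after = "a" in namespace
bound_object = namespace["a"]

print(existed_before, exists_after, bound_object, type(bound_object).__name__)
# False True 1 int
```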
A variable is a symbolic name in a system table (the namespace) that holds links (i.e. references) to objects. In other words, references are pointers from variables to objects. In Python, variables themselves do not have a type. Therefore, it is possible to assign objects of different types to the same variable name, as shown below.
a = 1
a = 'Hello World'
a = False
In the first line, variable a is assigned a reference to the integer object with value 1. Likewise, the second line changes the reference of variable a to a different object of type string, while the last line changes the reference so that a now points to a boolean object.
When we refer to objects we actually mean a piece of allocated memory that is capable of representing the value we wish. This value can be an integer, a string or of any other type. Apart from the value, objects also come with a couple of header fields. These fields include the type of the object as well as its reference counter, which (in CPython, the reference implementation) is used to determine when the memory of an unused object can be reclaimed. And since Python objects know their own type, variables don't have to remember this piece of information.
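Both header fields can be observed from Python code, at least on CPython. The sketch below assumes CPython, where sys.getrefcount is available; the exact numbers it returns are an implementation detail, so only the change between the two calls is meaningful.

```python
import sys

value = [1, 2, 3]

# The type is read from the object, not from the name.
print(type(value))  # <class 'list'>

# Binding a second name to the same object adds a reference (CPython-specific).
before = sys.getrefcount(value)
alias = value
after = sys.getrefcount(value)

print(after > before)
```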
Shared references
In Python, it is possible for multiple variables to reference the same object. This behaviour is called a shared reference. For example, consider the code below:
a = 1
b = a
Initially, Python creates an integer object with value 1, then creates the variable with name a, and finally assigns the reference that points to the integer object in memory to the variable.
The second line is where the shared reference comes into play. Python now creates variable b and makes it hold a reference to the same object 1 that variable a references. Note that variables a and b are completely independent of each other and are not linked in any way. They simply hold references to the same integer object in memory.
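One way to confirm that both names refer to one object is to compare identities. This is a minimal sketch using the built-in id function and the is operator.

```python
a = 1
b = a

# Both names reference the very same object.
same_object = a is b
same_id = id(a) == id(b)

print(same_object, same_id)
# True True
```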
Now consider another example in which we have one additional operation:
a = 1
b = a
a = 'Hello World'
In this example, an additional string object with value 'Hello World' is created in memory and its reference is assigned to variable a. It is important to highlight that in this case the value of variable b remains unchanged. The same behaviour is observed even when the new object is of the same type:
a = 1
b = a
a = a - 1
Again, the last statement will trigger the creation of a new integer object with value 0 and finally assign its reference to variable a while variable b remains unchanged (i.e. it keeps referencing object with value 1 ).
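The rebinding can be made visible by recording the identity of the object that a references before and after the last statement. The name id_before is introduced only for this sketch.

```python
a = 1
b = a
id_before = id(a)

# Rebinds a to a different object; the object 1 itself is untouched.
a = a - 1

print(a, b)                # 0 1
print(id(a) == id_before)  # False
print(id(b) == id_before)  # True
```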
Shared references and Mutable vs Immutable objects
In Python, all variables are references (i.e. pointers) to a specific memory location that holds the object. However, the way that objects are created and modified depends on whether their type is mutable or immutable.
As we have seen in the previous example, the last assignment a = a - 1 won't modify the object itself since the integer type is immutable. This means that every time we want to change the value of a variable holding an immutable type (such as an integer or a string), Python binds the variable to a different object that holds the required value. For immutable types this is straightforward and makes altering variables quite safe: since in-place changes are not possible on immutable types, a reassignment cannot affect the values of existing objects.
However, this is not the case for mutable types, and programmers must be aware of the way Python deals with mutable object types in order to avoid unwanted behaviour and bugs in the code.
Mutable object types allow in-place changes, which means that when their value is modified, all variables referencing that object are affected. Such object types include lists, dictionaries and sets. To illustrate this concept let's consider the following example where we have two variables holding a reference to the same list object:
list_1 = [1, 2, 3]
list_2 = list_1
Now let’s start with the easy use-case where we want to assign to the second list a newly created list:
list_1 = [1, 2, 3]
list_2 = list_1
list_2 = [2, 3, 4]
Now the two variables hold different references, pointing to two distinct objects. Thus, if we make an in-place change to either list, this change won't affect the other one since the two objects are different and are stored in different locations in memory.
Things get a bit more complicated when we attempt to modify a list object which is referenced by several variables. To illustrate this, let's consider one more example shown below:
list_1 = [1, 2, 3]
list_2 = list_1
list_1[0] = 0
Now if we print out both lists, we'll notice that even though we changed only list_1, the change is also visible in the contents of list_2:
print(list_1)  # [0, 2, 3]
print(list_2)  # [0, 2, 3]
In this example, we haven't touched list_2 itself, yet the change made through list_1 is visible through list_2. As we already explained, this happens because list_2 references the same object as list_1. Therefore, an in-place change to an object impacts all variables referencing it. And once again, it is important to highlight that this behaviour is observed only for mutable object types, and you must be aware of the way in-place changes behave in Python so that you don't accidentally introduce bugs to your code.
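The difference between an in-place change and a rebinding can be seen side by side in the sketch below. It reuses the names list_1 and list_2 from the example above; the appended values are arbitrary.

```python
list_1 = [1, 2, 3]
list_2 = list_1

# In-place change through the second name: both names see it.
list_2.append(4)
print(list_1)            # [1, 2, 3, 4]
print(list_1 is list_2)  # True

# Concatenation builds a new list, and the assignment rebinds list_2 only.
list_2 = list_2 + [5]
print(list_1)            # [1, 2, 3, 4]
print(list_2)            # [1, 2, 3, 4, 5]
print(list_1 is list_2)  # False
```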
In most cases, this behaviour is what programmers actually want to see in their code. However, there are also numerous use-cases where it is not desired and alternative solutions need to be considered so that a change made through one variable does not impact other variables referencing the same object. If that's the case then copying such objects is the way forward.
Copying objects in Python
Python comes with a standard library module called copy that offers functionality for copying objects. The two copy types are shallow and deep, and the difference between them matters only when you deal with compound objects, that is, objects containing other objects (for instance a list of dictionaries, or a list of lists).
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
For instance, consider an example where we need to take a copy of a list and then modify one of the two variables. In this case, each variable will now point to a different object and thus an in-place change in one of the two objects won’t affect the other:
import copy
a = [1, 3, 4, 7]
b = copy.copy(a)
b[0] = -1
print(a)  # [1, 3, 4, 7]
print(b)  # [-1, 3, 4, 7]
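For lists specifically, copy.copy is not the only way to get a shallow copy. As a small sketch, the list.copy method, the list constructor and a full slice all produce a new list with the same elements; the names copies, all_equal and none_same are introduced only for this illustration.

```python
import copy

a = [1, 3, 4, 7]

# Four ways to take a shallow copy of a list.
copies = [copy.copy(a), a.copy(), list(a), a[:]]

# Every copy has the same contents but is a distinct object.
all_equal = all(c == a for c in copies)
none_same = all(c is not a for c in copies)

print(all_equal, none_same)
# True True
```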
However, shallow copies won't do the trick when you have a compound object with nested mutable types, for instance a list of lists. In the example below, we take a shallow copy d of a list of lists c. Changing a nested list in place, whether through the original variable a or through the compound object c, also shows up in the copied list d:
import copy
a = [1, 3, 5, 7]
b = [2, 4, 6, 8]
c = [a, b]
d = copy.copy(c)
a[0] = -1
c[0][1] = -3
print(d)  # [[-1, -3, 5, 7], [2, 4, 6, 8]]
This is because a shallow copy does not create new objects for the nested instances; instead, it copies the references to the original nested objects. In most cases we need new objects even for the nested instances so that the copied compound object is completely independent of the old one. In Python this is called a deep copy.
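The sharing of nested objects can be verified with identity checks. This sketch reuses the names a, b, c and d from the example above; the appended value is arbitrary.

```python
import copy

a = [1, 3, 5, 7]
b = [2, 4, 6, 8]
c = [a, b]
d = copy.copy(c)

# The outer list is new, but the nested lists are shared.
print(d is c)     # False
print(d[0] is a)  # True

# A change to the outer list c alone does not reach d.
c.append([9])
print(len(c), len(d))  # 3 2
```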
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
Now if object d is a deep copy of c, a change to either a, b or c won't affect the copied list:
import copy
a = [1, 3, 5, 7]
b = [2, 4, 6, 8]
c = [a, b]
d = copy.deepcopy(c)
a[0] = -1
c[0][1] = -3
print(d)  # [[1, 3, 5, 7], [2, 4, 6, 8]]
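The same identity checks as before show why the deep copy is unaffected: here the nested lists are copies too. A minimal sketch:

```python
import copy

c = [[1, 3, 5, 7], [2, 4, 6, 8]]
d = copy.deepcopy(c)

# Equal contents, but no object is shared at any level.
print(d == c)        # True
print(d is c)        # False
print(d[0] is c[0])  # False

# An in-place change to a nested list of c leaves d alone.
c[0][0] = -1
print(d[0][0])       # 1
```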
Object equality explained
Because of the way Python's reference model works, there are two ways to compare objects, and which one to use comes down to what exactly you want to check. To check whether two objects have the same value, use the == operator. On the other hand, to check whether two variables point to the same object, use the is operator.
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> c = a
>>>
>>> a == b # both lists have the same values
True
>>> a == c # both lists have the same values
True
>>> a is c # variables a and c point to the same object
True
>>> a is b # variables a and b point to different objects
False
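The same distinction applies to your own classes. The sketch below uses a hypothetical Point dataclass (not part of the original example); dataclasses generate an __eq__ method that compares field values, while is still compares identity. It also shows the common idiom of testing for None with is.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical class, used only for illustration
    x: int
    y: int


p = Point(1, 2)
q = Point(1, 2)

print(p == q)  # True: same field values
print(p is q)  # False: two separate objects

# None is a single object, so identity is the usual check for it.
result = None
print(result is None)  # True
```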
The big disadvantage of dynamically typed languages
Even though the dynamic typing model offers flexibility, it can also cause trouble when building Python applications. The main advantage of statically typed languages such as Java is that type checks are done at compile-time and thus many bugs can be identified early on.
On the other hand, dynamically typed languages such as Python boost developers' productivity; however, developers need to be extra careful because type errors are reported only at runtime. For example, consider the scenario where the same variable points to different object types throughout the code (and note that this is not even allowed in most statically typed languages). You must ensure that only relevant operations are performed on that variable/object at each point in the code so that you avoid calling functions or methods which are not applicable to that specific object type.
a = [1, 2, 3] # a is initially a list
a = 1 # Now a is an int
a[0] = 0 # Indexing can only be applied over sequences
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object does not support item assignment
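One possible way to catch this kind of mistake closer to its source is an explicit runtime check with isinstance. The helper set_first below is hypothetical and only a sketch of the idea; whether such checks are worthwhile depends on the code base.

```python
def set_first(container, value):
    """Hypothetical helper: replace the first item of a list."""
    # Fail early with a clear message if the object is not a list.
    if not isinstance(container, list):
        raise TypeError(f"expected a list, got {type(container).__name__}")
    container[0] = value
    return container


print(set_first([1, 2, 3], 0))  # [0, 2, 3]

try:
    set_first(1, 0)
except TypeError as err:
    print(err)  # expected a list, got int
```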
Conclusion
In this article, we explored one of the fundamental characteristics of Python, which is its dynamic typing model. We discussed the relationship between variables, objects and references as well as the details of Python's reference model. We've seen how shared references work and how to avoid unwanted behaviour when dealing with mutable object types.
The dynamic typing model is a powerful characteristic of a programming language; however, programmers need to be extra careful when writing applications in dynamically typed languages such as Python, since it is much easier to introduce bugs by mistake.