RC00: Interesting Python behavior
I attended a pair programming workshop as part of the RC orientation talks. After a brief introduction to pairing and how it works, the attendees were paired with each other to code Conway's Game of Life within an hour.
My pairing partner, Manuel, and I decided to code in Python. The first step was to create a grid, and for some reason my mind came up with this line of code:
grid=[['X']*10]*10
I don't recall seeing something like this before, so it came as a bit of a surprise. It seemed like a very neat trick and I was proud of myself. On running the code, it worked as intended, which was even more surprising.
$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']]
The next stage was to set the initial state of the grid. This meant changing certain cells of the grid to another value, O in our case.
grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'
I was sure this was going to work, but this happened instead.
$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']]
Instead of specific cells, entire columns were updated? Strange.
My attention went to the multiplication operator (*). Was it repeating references to the same object instead of making each row a different object?
Turns out, yes! However, I wasn't entirely convinced and needed some evidence to back this up. After the workshop I came across Python Tutor, a tool to visualize Python code, and decided to try it out.
The intended scenario would be to see 11 different lists: the grid list, and 10 lists containing 10 X's each. However, here we see only 2 lists: grid and one list with 10 X's, which I'll call row from now on.
Each index of grid points to row. So grid[0] is grid[1] is ... is grid[9], meaning they are the same object and not merely equal lists. Any change to row shows up in grid[n], n being 0-9 in this case. Evidence found!
We can also see this in action using id() within Python, which returns an integer identifying an object. In CPython, that integer is the object's memory address.
>>> grid=[['X']*10]*10
>>> id(grid[0])
4315577152
>>> id(grid[1])
4315577152
The memory address for both rows is the same, indicating that they're pointing to the same object.
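The same check can be made without comparing addresses by eye. Here is a minimal sketch using the `is` operator, which tests object identity rather than equality. The names `all_same` and `distinct_rows` are mine, chosen for illustration.

```python
grid = [['X'] * 10] * 10

# `is` compares identity, not contents: every row is the same object.
all_same = all(row is grid[0] for row in grid)

# Counting the distinct ids shows how many row objects actually exist.
distinct_rows = len({id(row) for row in grid})

print(all_same)       # True
print(distinct_rows)  # 1
```

Note that `grid[0] == grid[1]` would be True even for two separate lists with the same contents, so `==` alone cannot reveal the sharing.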
The same thing happens when a list of any other mutable object is multiplied, for example a list containing a dictionary:
>>> data = [{'x':1, 'y':2}]*10
>>> id(data[0])
4312977664
>>> id(data[1])
4312977664
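To see what the shared dictionary means in practice, here is a small sketch (using 3 copies instead of 10 to keep the output short). It contrasts mutating the shared object with rebinding a single index.

```python
data = [{'x': 1, 'y': 2}] * 3

# Mutating the dictionary through one index...
data[0]['x'] = 99

# ...is visible through every other index, since all three refer to one dict.
print(data)  # [{'x': 99, 'y': 2}, {'x': 99, 'y': 2}, {'x': 99, 'y': 2}]

# Rebinding an index, by contrast, replaces only that slot of the outer list.
data[1] = {'x': 0, 'y': 0}
print(data[0], data[1])  # {'x': 99, 'y': 2} {'x': 0, 'y': 0}
```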
How does this explain the two columns being changed? Here's what happened:
- The indexes we tried to change were grid[5][5], grid[4][5] and grid[4][6].
- grid[4] and grid[5] are pointing to row, so indexes 5 and 6 were changed in row.
- As all indexes of grid point to row, printing grid basically prints row 10 times, giving the illusion of columns being changed.
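The steps above can be replayed in a few lines. This is a sketch that binds the name `row` (my label from earlier, not something Python creates) to the shared list so the effect of each assignment is easy to follow.

```python
grid = [['X'] * 10] * 10
row = grid[0]  # any index would do; they all refer to the same list

grid[5][5] = 'O'  # sets row[5]
grid[4][5] = 'O'  # sets row[5] again, so nothing new changes
grid[4][6] = 'O'  # sets row[6]

print(row)
# ['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']

# Every entry of grid is still that one list.
print(all(r is row for r in grid))  # True
```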
The solution to this is to create the grid a different way, using either two nested for-loops or a list comprehension.
# For-loops
grid=[]
for i in range(10):
    grid.append([])
    for j in range(10):
        grid[i].append('X')
# List comprehension
grid=[['X' for _ in range(10)] for _ in range(10)]
The visualization for either of these approaches looks completely different from the earlier approach.
Instead of one row, there are now 10 rows. Changes to one row won't affect the other rows, which is the intended behavior.
In the interpreter, id() returns a different value for each row, further confirming that they're different objects.
>>> grid=[['X' for _ in range(10)] for _ in range(10)]
>>> id(grid[0])
4315575040
>>> id(grid[1])
4315559744
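As a final check, here is a sketch that re-runs the three assignments from the workshop against the comprehension-built grid. The `short` variant at the end is not from our workshop code. It is a shorter form I believe is equivalent, because the inner `['X'] * 10` only repeats references to an immutable string, while the comprehension builds a new row list on each pass.

```python
grid = [['X' for _ in range(10)] for _ in range(10)]

grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'

# Only the targeted cells change; other rows are untouched.
print(grid[4])  # ['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']
print(grid[5])  # ['X', 'X', 'X', 'X', 'X', 'O', 'X', 'X', 'X', 'X']
print(grid[0])  # ['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']

# Shorter variant (an assumption of mine, not the workshop code):
# the outer comprehension still creates 10 separate row lists.
short = [['X'] * 10 for _ in range(10)]
print(len({id(r) for r in short}))  # 10
```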