RC00: Interesting Python behavior

2023-08-10

I'm attending the Fall 1 batch at Recurse Center! Posts in this series cover things I'm working on or find interesting during my time here.

I attended a pair programming workshop as part of the orientation talks. After a brief introduction to pairing and how it works, the attendees were paired with each other to code Conway's Game of Life within an hour.

My pairing partner, Manuel, and I decided to code in Python, the first step being to create a grid. For some reason, my mind came up with this line of code:

grid=[['X']*10]*10

I don't recall seeing something like this before, so it came as a bit of a surprise. It seemed like a very neat trick and I was proud of myself. When I ran the code, it worked as intended, which was even more surprising.

$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']]

The next stage was to set the initial state of the grid. This meant changing certain cells of the grid to another value, O in our case.

grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'

I was sure this was going to work, but this happened instead.

$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']]

Instead of specific cells, entire columns were updated? Strange.

The multiply sign raised some concerns. Was it repeating references to the same object instead of creating a different object for each row?

Turns out, yes! However, I wasn't entirely convinced and needed some evidence to back this up. After the workshop I came across Python Tutor, a tool to visualize Python code, and decided to try it out.

Link to visualization

The intended scenario would be to see 11 different lists - the grid list, and 10 lists containing 10 X's each. However, here we see only 2 lists - grid and one list with 10 X's, which I'll call row from now on.

Each index of grid points to row. So grid[0] is grid[1] is ... is grid[9], meaning they are the same object and not just equal in content. Any change to row shows up in grid[n], n being 0-9 in this case. Evidence found!

We can also see this in action using id() within Python, which returns an object's identity as an integer. In CPython, that integer is the object's memory address.

>>> grid=[['X']*10]*10
>>> id(grid[0])
4315577152
>>> id(grid[1])
4315577152

The memory address for both rows is the same, indicating that they're pointing to the same object [1].
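Comparing ids by eye works, but the `is` operator makes the same check directly, and it also shows why `==` can't detect the problem. A minimal sketch on a smaller 3x3 grid; the names `shared` and `separate` are mine, for illustration only:

```python
# Outer list built by multiplication: three references to one row.
shared = [['X'] * 3] * 3

# Rows written out by hand: three distinct lists with equal contents.
separate = [['X', 'X', 'X'], ['X', 'X', 'X'], ['X', 'X', 'X']]

# == compares contents, so it is True for both grids.
print(shared[0] == shared[1])      # True
print(separate[0] == separate[1])  # True

# is compares identity, so it tells the two grids apart.
print(shared[0] is shared[1])      # True
print(separate[0] is separate[1])  # False

# Counting distinct ids gives the number of distinct row objects.
print(len({id(row) for row in shared}))    # 1
print(len({id(row) for row in separate}))  # 3
```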

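How does this explain the two columns being changed? Here's what happened: since every grid[n] is row, the row index in each assignment makes no difference. grid[5][5] = 'O' and grid[4][5] = 'O' both set row[5], and grid[4][6] = 'O' sets row[6]. Printing grid shows that one row ten times, so the O's appear at positions 5 and 6 on every line, which looks like two full columns.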
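Tracing the three assignments against the shared list makes this concrete. A small sketch, where `row` is just a label for the single inner list, as in the text above:

```python
grid = [['X'] * 10] * 10
row = grid[0]  # every grid[n] refers to this same list

grid[5][5] = 'O'  # same effect as row[5] = 'O'
grid[4][5] = 'O'  # row[5] again, so nothing new changes
grid[4][6] = 'O'  # same effect as row[6] = 'O'

print(row)
# ['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']

# All ten entries of grid are still that one list.
print(all(r is row for r in grid))  # True
```

Three assignments touched only two positions of one list, and the ten identical lines of output are that list printed ten times.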

The solution to this is to create the grid a different way, so that each row is built as a new list: either with a combination of two for-loops, or with a list comprehension.

# For-loops
grid=[]
for i in range(10):
    grid.append([])
    for j in range(10):
        grid[i].append('X')

# List comprehension
grid=[['X' for _ in range(10)] for _ in range(10)]

The visualization for either of these approaches looks completely different from the earlier approach.

Link to visualization

Instead of one row, there are now 10 rows. Changes to one row won't affect the other rows, which is the intended behavior.

In Python, id() now returns a different value for each row, further confirming that they're different objects.

>>> grid=[['X' for _ in range(10)] for _ in range(10)]
>>> id(grid[0])
4315575040
>>> id(grid[1])
4315559744
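To close the loop, here is a short check that replays the original three assignments on the comprehension-built grid and lists which cells actually changed. The names `changed` and `compact` are mine, and the `compact` variant is an extra I'm adding as an aside, not something we used in the workshop:

```python
grid = [['X' for _ in range(10)] for _ in range(10)]

grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'

# Print the grid one row per line.
for r in grid:
    print(''.join(r))

# Collect the (row, column) position of every 'O'.
changed = [(i, j) for i, r in enumerate(grid)
           for j, cell in enumerate(r) if cell == 'O']
print(changed)  # [(4, 5), (4, 6), (5, 5)]

# Aside: multiplying only the inner list also gives separate rows,
# because the comprehension evaluates ['X'] * 10 once per row.
compact = [['X'] * 10 for _ in range(10)]
print(compact[0] is compact[1])  # False
```

Only the three intended cells change. The inner `['X'] * 10` is safe because strings are immutable: the repeated references point to the same 'X', but nothing can modify it in place.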

Footnotes

  1. This isn't an issue specific to lists of lists, but applies to lists of other mutable items as well, for example a list of dictionaries:

    >>> data = [{'x':1, 'y':2}]*10
    >>> id(data[0])
    4312977664
    >>> id(data[1])
    4312977664