I attended a pair programming workshop as part of the orientation talks. After a brief introduction to pairing and how it works, the attendees were paired with each other to code Conway's Game of Life within an hour.

My pairing partner, Manuel, and I decided to code in Python, the first step being to create a grid. For some reason, my mind came up with this line of code:

grid=[['X']*10]*10

I don't recall seeing something like this before, so it came as a bit of a surprise. It seemed like a very neat trick and I was proud of myself. On running the code, it worked as intended, which was even more surprising.

$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']]

The next stage was to set the initial state of the grid. This meant changing certain cells of the grid to another value, O in our case.

grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'

I was sure this was going to work, but this happened instead.

$ python3 gameoflife.py
[['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X'],
['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']]

Instead of specific cells, entire columns were updated? Strange.

The multiply sign raised some concerns. Was it duplicating the same object instead of each row being a different object?

Turns out, yes! However, I wasn't entirely convinced and needed some evidence to back this up. After the workshop I came across Python Tutor, a tool to visualize Python code, and decided to try it out.

There are three blocks: a global frame with the variable name "grid", which points to a block representing an array of length 10. Every index of that array points to the same third block, an array of length 10 with the string "X" at each index.

The intended scenario would be to see 11 different arrays - the grid array, and 10 arrays containing 10 X's each. However, here we see 2 arrays - grid and one array with 10 X's, which I'll call row from now on.

Each index of grid points to row, so grid[0] is grid[1] is ... is grid[9]. Any change to row shows up in grid[n] for every n from 0 to 9. Evidence found!

We can also see this in action using id(), which returns an object's identity (in CPython, its memory address).

>>> grid=[['X']*10]*10
>>> id(grid[0])
4315577152
>>> id(grid[1])
4315577152

The memory address for both rows is the same, indicating that they're pointing to the same object [1].
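As a small sketch of the same check without reading addresses by eye, the `is` operator compares identity directly. The names `a`, `b`, and the result variables are only for illustration; they are not part of the workshop code.

```python
grid = [['X'] * 10] * 10

# `is` compares identity: every entry of grid is the very same list object.
all_same_object = all(row is grid[0] for row in grid)
distinct_ids = len({id(row) for row in grid})

# Two lists built separately are equal in value but are not the same object,
# which is why `==` alone cannot reveal the sharing.
a = ['X'] * 10
b = ['X'] * 10
equal_but_distinct = (a == b) and (a is not b)

print(all_same_object)     # True
print(distinct_ids)        # 1
print(equal_but_distinct)  # True
```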

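How does this explain the two columns being changed? Here's what happened: every assignment went through the one shared row. grid[5][5] and grid[4][5] both set index 5 of row, and grid[4][6] set index 6. Since every entry of grid is that same row, columns 5 and 6 show 'O' in every line of the output.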
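Replaying the three assignments from earlier against the shared row makes this concrete. This is only a sketch; the name `row` is introduced here to refer to the single inner list.

```python
grid = [['X'] * 10] * 10
row = grid[0]  # the single shared inner list

grid[5][5] = 'O'  # grid[5] is row, so this sets row[5]
grid[4][5] = 'O'  # grid[4] is row too, so this sets row[5] again
grid[4][6] = 'O'  # sets row[6]

print(row)
# ['X', 'X', 'X', 'X', 'X', 'O', 'O', 'X', 'X', 'X']

# Every "row" of the grid is still that one list, so all ten lines match.
print(all(r is row for r in grid))  # True
```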

The solution is to create the grid a different way, using either a pair of nested for-loops or a list comprehension.

# For-loops
grid=[]
for i in range(10):
    grid.append([])
    for j in range(10):
        grid[i].append('X')

# List comprehension
grid=[['X' for _ in range(10)] for _ in range(10)]

The visualization for either of these approaches looks completely different from the earlier approach.

There are a total of twelve blocks: a global frame with the variable name "grid", which points to a block representing an array of length 10. Each index of that array points to its own separate block, an array of length 10 with the string "X" at each index.

Instead of one row, there are now 10 rows. Changes to one row won't affect the other rows, which is the intended behavior.

In Python, id() returns different addresses for each row, further confirming that they're different objects.

>>> grid=[['X' for _ in range(10)] for _ in range(10)]
>>> id(grid[0])
4315575040
>>> id(grid[1])
4315559744
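To check that the original assignments now behave as intended, here is a short sketch. It uses a slightly shorter variant that I did not use in the workshop: the inner `['X'] * 10` is safe because it runs once per iteration, producing a fresh list each time, and the strings it repeats are immutable. The name `changed` is just for illustration.

```python
# A fresh inner list is built on every iteration of the comprehension.
grid = [['X'] * 10 for _ in range(10)]

grid[5][5] = 'O'
grid[4][5] = 'O'
grid[4][6] = 'O'

# Collect the coordinates of every cell that now holds 'O'.
changed = [(i, j)
           for i, row in enumerate(grid)
           for j, cell in enumerate(row)
           if cell == 'O']

print(changed)  # [(4, 5), (4, 6), (5, 5)]
```

Only the three targeted cells change, rather than two whole columns.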

Notes

  1. This isn't specific to lists of lists; it applies to a list of any mutable object, for example a dictionary:

    >>> data = [{'x':1, 'y':2}]*10
    >>> id(data[0])
    4312977664
    >>> id(data[1])
    4312977664
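As a sketch of the consequence for the dictionary case, changing one entry changes them all, while a comprehension gives independent dictionaries. The name `independent` is only for illustration.

```python
# Ten references to one dictionary.
data = [{'x': 1, 'y': 2}] * 10
data[0]['x'] = 99
print(data[9]['x'])  # 99

# Ten separate dictionaries, one created per iteration.
independent = [{'x': 1, 'y': 2} for _ in range(10)]
independent[0]['x'] = 99
print(independent[9]['x'])  # 1
```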