2014-03-21

Call by what? Understanding Python variables

You might run across something similar to this in a Python program:

consumer_x = consume(consumer_x, banana, milk)
weigh(consumer_x)

def consume(consumer, food, beverage):
    consumer.eat(food)
    consumer.drink(beverage)
    return consumer

This code is a bit more convoluted than it has to be. To understand why, we need to understand how Python variables work.

Labeled Shoeboxes

A variable in an old language like C is like a box with a label. (For some reason I thought of shoeboxes when I studied C.) It has a certain size, e.g. big enough to fit an integer, or a double precision floating point number. You can write something like this in C:

int a = 12345;
int b;
b = a;
a = a + b;
b = a;

This roughly means the following:
1. Make an integer sized box, label it "a", and place the value 12345 in it.
2. Make another integer sized box and label it "b". Leave it empty for now.
3. Copy the value of a into the b box. Now both a and b contain 12345.
4. Calculate 12345 + 12345 = 24690 and put that in the a box.
5. Let's copy the value of a into b again. Both boxes contain 24690 now.

Call by what?

If you do something like "consume(consumer, food, beverage)" in a language with C-like variables, you need to make up your mind about the semantics here. What's going on with the variable "a" if you pass it to something like consume?

Will the value in the "a box" be copied into some other box inside consume? We would call this call by value or pass by value.

The other option would be that the code inside consume would be able to access the "a box". We call that call by reference.

So, what about Python? Is it call by value or call by reference? Neither, I'd say. Some call it call by object or call by sharing. An experienced C programmer would probably say that we're passing a pointer to whatever "a" refers to, by value, but let me explain this with something more similar to the shoebox.
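
Before we get to the analogy, here is a minimal sketch of what call by sharing looks like in practice. The function names rebind and mutate are made up for this illustration. The point is only that assigning to a parameter inside a function doesn't affect the caller, while calling a method on the parameter does.

```python
def rebind(items):
    # Assignment inside the function only re-points the local name
    # to a brand new list; the caller's list is left alone.
    items = items + ["extra"]
    return items


def mutate(items):
    # A method call changes the very object the caller also refers to.
    items.append("extra")


original = ["a"]
rebound = rebind(original)
after_rebind = list(original)   # snapshot of the caller's list after rebind()

mutate(original)
after_mutate = list(original)   # snapshot of the caller's list after mutate()

print(after_rebind, after_mutate)
```

If this were call by value, mutate() could not touch the caller's list. If it were call by reference, rebind() would have replaced it.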

Balloons, tags and strings

For some reason, I see Python variables as balloons floating in the sky. They can take any size, and while you can't really control where they are, there are strings attached. You hold on to the other end of the string, and you've put a tag on it.

Let us repeat the example from above in a Pythonic way:

a = 12345
b = a
a = a + b
b = a

1. Let's make a balloon and put the integer 12345 in it. Attach a string and tag it "a" at the end you hold on to. (In fact, this balloon will never contain any other value than 12345, but we'll get back to that.)
2. Let's attach a new string to the 12345 balloon, and tag that "b". Now there are two strings attached to 12345.
3. Let's make a new balloon and fill it with 12345 + 12345 = 24690. Move the string tagged "a" from the 12345 balloon to the 24690 balloon. The balloons now have a string each.
4. Move the string tagged "b" to the balloon with the "a" string attached, i.e. the 24690 balloon. The 12345 balloon no longer has any strings attached, so it flies away.

A more technical way to put it is that 12345 gets garbage collected (or, to be particular, it is reclaimed through reference counting if it's the standard version of Python).
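
The balloon steps above can be checked with the "is" operator, which tells us whether two names are attached to the same object. This is just a sketch; the step*_same names are made up here to record the result after each step.

```python
a = 12345
b = a
step2_same = a is b    # two strings on one balloon

a = a + b              # new balloon holding 24690; "a" moves to it
step3_same = a is b    # "b" is still on the 12345 balloon

b = a                  # "b" moves over to the 24690 balloon as well
step4_same = a is b

print(step2_same, step3_same, step4_same)
```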

Mutable and immutable types

All Python objects (oops, I meant balloons) have a type. Every type is either mutable or immutable. Integers are immutable, so once you've created an integer balloon (or object) it will never change its value. So, while "a = a + 1" means "change the value in the a-box" in C, it means "move the a-tagged string to a balloon with another value" in Python.

That's true for both mutable and immutable types: Assignment in Python always means "attach the string to a balloon", usually a different one. The silly thing about the example we began with is that it's the same balloon the string was already attached to. Let's go back to that:

consumer_x = consume(consumer_x, banana, milk)

def consume(consumer, food, beverage):
    consumer.eat(food)
    consumer.drink(beverage)
    return consumer

Before we call consume(...), consumer_x is a tagged string attached to some balloon. When we call consume() we attach the string with a consumer-tag inside the consume() function to the same balloon. Then we call the .eat() and .drink() methods on our balloon. Then we pass our balloon back to the caller. Finally, we reuse our variable name and do consumer_x = consume(...). This means that we detach the string from the balloon it was connected to, and instead we ... yes that's right ... we re-attach it to the same balloon again. That's silly, isn't it?

The crucial thing is that we don't reassign consumer in consume. Since we can't see any "consumer = ..." in the body of consume, we know for sure that we return the same object as we got as input. Not much point in that. It's as if I visited your home, grabbed one of the flower pots from your window sill and presented it to you as a gift.

For this function to make any sense at all, consumer is hardly of an immutable type, like an integer. It's probably an instance of a class. Along with e.g. lists, sets and dicts, that's a mutable type.
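
Here is a sketch of how the example could look without the pointless return and reassignment. The Consumer class is a hypothetical stand-in, since we never saw what consumer_x actually was; its stomach attribute is invented purely so that we have some state to inspect.

```python
class Consumer:
    """Hypothetical stand-in for whatever class consumer_x belongs to."""

    def __init__(self):
        self.stomach = []   # invented attribute, just to have visible state

    def eat(self, food):
        self.stomach.append(food)

    def drink(self, beverage):
        self.stomach.append(beverage)


def consume(consumer, food, beverage):
    consumer.eat(food)
    consumer.drink(beverage)
    # No return needed: the caller already holds a string to this balloon.


consumer_x = Consumer()
consume(consumer_x, "banana", "milk")
print(consumer_x.stomach)
```

The caller sees the changes through its own name, so there is nothing to hand back.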

The difference between immutable and mutable is that mutable objects can change (or mutate) after their creation. This is pretty important in Python. For instance, you can only use immutable values as keys in dicts (strictly speaking, the keys have to be hashable, which the built-in mutable types are not). With an object such as a list, the value can change even though it's the same balloon a.k.a. object.

>>> a = 1
>>> print a, id(a)
1 30716560
>>> a += 2
>>> print a, id(a)
3 30716536
>>> # See, new value and new id, i.e. another object.
...
>>> l = []
>>> print l, id(l)
[] 37012744
>>> l.append('x')
>>> print l, id(l)
['x'] 37012744
>>> # New value, but still same object!
...
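
The remark about dict keys can be tried out in the same spirit. In this small sketch (the names point, lookup and list_key_accepted are made up), a tuple works as a key, while a list is rejected with a TypeError.

```python
point = (1, 2)                       # tuples are immutable
lookup = {point: "tuple key works"}

try:
    lookup[[1, 2]] = "list key"      # lists are mutable and unhashable
    list_key_accepted = True
except TypeError:
    list_key_accepted = False

print(lookup, list_key_accepted)
```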

Let's make it a little more complicated...

If we look at the list above, its balloon can obviously grow. A Python list is like an array or vector in other languages, so if it's a list of 100 floating point numbers, it's a big balloon. It won't contain 100 floats though. It will contain 100 strings, each leading to a floating point balloon.
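
That a list holds strings to balloons, rather than the balloons themselves, is easiest to see when two slots lead to the same mutable object. A minimal sketch, with made-up names:

```python
shared = ["x"]
outer = [shared, shared]   # two slots, both tied to the same inner list

outer[0].append("y")       # change the inner list through the first slot
second_slot = outer[1]     # the second slot shows the change too

print(outer, outer[0] is outer[1])
```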

So, the strings we attach to balloons can either end in another balloon, or they can have a tag in a location we call a scope. The balloons float in a part of the computer memory we call the heap, and the scopes with tags are in another part of the memory, called the stack. As long as you stick to Python, you don't really have to care about that.

Which is the variable?

In a C-like language, it's pretty obvious what a variable is. It's a labeled shoebox of a particular size/type. It's called a variable, since its content can vary (within the constraints of the type). If you declare it const, it's not a variable, but a constant.

But what about Python? Which is actually the variable? The tag? The balloon? It's not the string, is it? If it's the tag, then Python variables don't have types, and that's a silly thing to claim. If it's the balloon, then Python doesn't have integer variables, just constants, and that would be an equally silly claim.

Perhaps it's the whole arrangement which is the variable. Perhaps the term variable doesn't make so much sense in Python? Maybe it's better to just talk about objects and names?
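
To see why "objects and names" fits better, note that the type belongs to the object, while the name can be moved to an object of any type. A tiny sketch, with names chosen only for illustration:

```python
thing = 12345
first_type = type(thing)     # the type comes from the balloon, not the tag

thing = "twelve thousand"    # same tag, now tied to a str balloon
second_type = type(thing)

print(first_type, second_type)
```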

Want to know more? Take a look at Fredrik Lundh's explanation at http://effbot.org/zone/python-objects.htm
