Tuesday, January 27. 2009

Global and local variables in Python

I just stumbled over some of my Python code, similar to the following, which raises an UnboundLocalError in the first line of push:

def func_stack1():
    stack = []
    def push(num):
        print "stack before: ", stack
        stack = stack + [num]
        print "stack after: ", stack
    return push

push = func_stack1()
push(1)

func_stack1() defines a local function push and returns it. push uses the stack variable defined in func_stack1 to push numbers onto a stack. The code raises the following error:

[...]
  File "./test.py", line 6, in push
    print "stack before: ", stack
UnboundLocalError: local variable 'stack' referenced before assignment

The error happens the first time push wants to print the stack.

However, the following code, which uses append instead of + to fill the stack, works just fine:

def func_stack2():
    stack = []
    def push(num):
        print "stack before: ", stack
        stack.append(num)
        print "stack after: ", stack
    return push

push = func_stack2()
push(1)

Why is that? The "Naming and binding" section of the Python reference describes a "subtle rule": if you assign to (i.e., "bind") a name anywhere in a function, the compiler treats that name as a local variable throughout the whole function body, even in lines that come before the assignment (unless you tell the compiler with global that the name refers to a global variable).

In func_stack1, the compiler sees that you assign something to stack in push and thus assumes that stack is a variable local to push. When push tries to print stack for the first time, this local stack has not been bound to anything yet, hence the error.
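The compiler's decision can be inspected without even calling push. The following is a minimal sketch, not part of my original code: it uses print() as a function so that it also runs on Python 3, and it relies on the __code__ attribute of function objects (available from Python 2.6 on) to list the names the compiler classified as local or free.

```python
def func_stack1():
    stack = []
    def push(num):
        # Reading `stack` fails here: the assignment below makes it local to push.
        print("stack before: %r" % (stack,))
        stack = stack + [num]
    return push

push = func_stack1()

# The classification happened at compile time, before push ever ran.
local_names = push.__code__.co_varnames   # names treated as local to push
free_names = push.__code__.co_freevars    # names looked up in an enclosing scope

try:
    push(1)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(local_names)
print(free_names)
print(error_name)
```

Here stack shows up among the local names of push and the tuple of free names is empty, which matches the explanation above.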

In func_stack2, you don't assign anything to stack but modify the list it refers to in place (using append). As there is no assignment to stack, the compiler concludes that stack is a free variable in push and therefore looks in the enclosing scope of push, i.e. func_stack2, to find the binding for stack.
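The same kind of inspection shows the difference for the append variant. Again this is only a sketch under the same assumptions as before (print() as a function, the __code__ and __closure__ attributes of function objects); the print statements from my original snippet are left out to keep it short.

```python
def func_stack2():
    stack = []
    def push(num):
        # A method call on the object; the name `stack` is never rebound.
        stack.append(num)
    return push

push = func_stack2()

free_names = push.__code__.co_freevars    # names resolved in the enclosing scope
local_names = push.__code__.co_varnames   # names local to push

push(1)
push(2)

# The closure cell holds the very list that was created in func_stack2.
shared = push.__closure__[0].cell_contents

print(free_names)
print(local_names)
print(shared)
```

This time stack is listed as a free variable, and the list stored in the closure keeps growing across calls.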

As far as I know, there is unfortunately no way in Python 2 to tell the compiler that stack in push of func_stack1 refers to the variable defined in func_stack1; the global keyword can only refer to the namespace at module level. The workaround is to never rebind a variable that lives between the global and the local scope level, and instead to modify the object it refers to in place, as append does in func_stack2.
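To make the workaround concrete, here is a small sketch with two hypothetical variants (the names func_stack_inplace and func_stack_nonlocal are made up for illustration). The first keeps the stack + [num] expression but writes the result back through a slice assignment, which changes the existing list and does not bind the name. The second uses the nonlocal statement, which exists only in Python 3, so this block as a whole does not compile on Python 2.

```python
def func_stack_inplace():
    stack = []
    def push(num):
        # Slice assignment mutates the existing list; `stack` stays a free variable.
        stack[:] = stack + [num]
        return list(stack)
    return push

def func_stack_nonlocal():
    stack = []
    def push(num):
        # Python 3 only: declare that `stack` belongs to the enclosing function.
        nonlocal stack
        # Rebinding now updates the variable of func_stack_nonlocal.
        stack = stack + [num]
        return stack
    return push

push_a = func_stack_inplace()
push_a(1)
result_a = push_a(2)

push_b = func_stack_nonlocal()
push_b(1)
result_b = push_b(2)

print(result_a)
print(result_b)
```

Both variants keep their state between calls. Note that stack += [num] would not help in the first variant: an augmented assignment also counts as binding the name, so it triggers the same UnboundLocalError.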