The Magic Behind Python Generator Functions

Written by skabbass1 | Published 2017/07/30
Tech Story Tags: python3 | python | programming | software-engineering



Generator functions are one of the coolest features of the Python programming language. There are numerous articles on the web describing the many benefits generator functions provide in terms of speed, scalability, and memory efficiency of our Python programs. However, there is not much material out there that explains how generator functions actually work behind the scenes. This article attempts to fill that void by shedding light on the key features of the Python language that make generator functions possible.

The fundamental feature that gives generator functions their superpowers is that a generator function can be paused and then resumed at any time. The local state of the generator function is kept intact when the function is paused and is available when the function is resumed. How is that possible? How can a function be paused and then resumed with its local state intact? As far as we know, functions have a single entry point and multiple exit points (return statements). Each time we call a function, the code executes from the first line of the function until it encounters an exit point. At that juncture, control returns to the caller of the function, the function’s stack of local variables is cleared, and the associated memory is reclaimed.

Generator functions, however, don’t behave this way. They have multiple entry and exit points. Each yield statement in a generator function simultaneously defines an exit point and a re-entry point. Execution of a generator function continues until a yield statement is encountered. At that point, the local state of the function is preserved and the flow of control is yielded to the caller of the generator function. When the generator function is resumed (by calling next or send, or by iterating with a for loop), its last known local state is conjured up and execution resumes from the line following the yield statement at which the generator function was last paused. This behavior is quite mind-boggling and does not conform to how functions normally behave.

To try and understand the magic behind generator functions, let’s begin by taking a closer look at a normal function:
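The original snippet was published as an image; a minimal sketch consistent with the text might look like this (the function name add_two_numbers and the local variable s come from the article; the argument names are assumed):

```python
# A plain, ordinary function: one entry point, one exit point.
def add_two_numbers(a, b):
    s = a + b  # local variable s lives on this call's stack frame
    return s   # exit point: control returns to the caller

print(add_two_numbers(2, 3))
```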

Each time add_two_numbers is called, we expect the CPython interpreter to create a new stack frame object and execute the add_two_numbers function in the context of this object. We expect the local variable s to get pushed onto this stack frame and remain there until the function exits. At function exit, we expect the associated stack frame to be cleared out and the corresponding memory to be reclaimed. Let’s confirm that this is the case:
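The original snippet was an image; a sketch of the experiment described in the next paragraph, assuming a module-level variable is used to hold on to the frame, could look like this:

```python
import inspect

frame = None  # will hold a reference to add_two_numbers' stack frame

def add_two_numbers(a, b):
    global frame
    frame = inspect.currentframe()  # capture this call's stack frame
    s = a + b
    return s

add_two_numbers(2, 3)

# The function has returned, yet the frame object we captured is still alive,
# together with its local variables:
print(frame)
print(frame.f_locals)
```

Note that keeping a reference to the frame is what keeps it alive; exact reprs vary across CPython versions.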

We use the builtin inspect module to capture the current frame of execution of the add_two_numbers function. Towards the end, we print the stack frame object and any local variables associated with it. Since the function has already returned, we would expect the stack frame to have been cleared out and, consequently, to hold no local variables. Let’s go ahead and execute the above code snippet:

WHAT?! The stack frame and all its associated local variables are still hanging around after the call to add_two_numbers concludes! What is happening here? Have we stumbled across a memory leak in CPython? No, that is not the case. This observation leads us to one of the fundamental characteristics of Python stack frames: they are not allocated on stack memory. Instead, they are allocated on heap memory. What this essentially means is that Python stack frames can outlive their respective function calls! Generator functions leverage this behavior to do their magic.

When the CPython compiler encounters the yield statement in a function, it sets a flag on the compiled code object to tell the CPython interpreter that the function is a generator function. We can use the dis module to see this in action:
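The original snippet was an image; a small sketch of the same check (the generator function two_numbers is a hypothetical example, not from the article) might be:

```python
import dis
import inspect

def two_numbers():  # hypothetical generator used for illustration
    yield 1
    yield 2

# The compiler sets the GENERATOR flag on the compiled code object:
print(bool(two_numbers.__code__.co_flags & inspect.CO_GENERATOR))

# dis.show_code lists the code object's flags, including GENERATOR:
dis.show_code(two_numbers)
```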

When the CPython interpreter sees the GENERATOR flag on the code object associated with a function, it does not execute the function but instead returns a generator object. The generator object is an iterator, which means we can step through it using the built-in next function or a for loop.
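A quick sketch of this behavior (the generator function is a hypothetical example):

```python
def two_numbers():
    yield 1
    yield 2

gen = two_numbers()      # no body code runs yet; we just get a generator
print(type(gen))         # a generator object
print(iter(gen) is gen)  # a generator is its own iterator
print(next(gen))         # advances to the first yield
print([n for n in gen])  # a for loop picks up where next() left off
```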

As we iterate through the generator function, execution continues until a yield statement is encountered. At that point, the stack frame of the function is frozen and control is returned back to the caller of the generator function.

As we continue advancing through the generator function by calling next or iterating with a for loop, execution resumes precisely from the point where it last left off (the last yield statement). How does the CPython interpreter know where execution of an instance of a generator function was last stopped? It knows this via the stack frame object associated with the generator instance being executed.
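The pause-and-resume behavior described above can be sketched with a hypothetical generator whose local state visibly survives each pause:

```python
def stepping():
    x = 1
    yield x   # exit point 1 / re-entry point 1
    x += 10   # runs only when the generator is resumed
    yield x   # exit point 2 / re-entry point 2

gen = stepping()
print(next(gen))  # runs up to the first yield, then pauses
print(next(gen))  # resumes after the first yield; local x was preserved
```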

We saw earlier that Python stack frames are allocated on heap memory and their state is preserved between subsequent calls to next or send on an instance of a generator function (of course, each new instance of a generator function gets its own stack frame). Besides storing information about the local and global variables, a Python stack frame encapsulates other useful bits of information. One such piece of information is the last instruction pointer.

The last instruction pointer is an index into the bytecode string of the code object associated with the body of the generator function and points to the last bytecode instruction that was run within the context of a stack frame. When an instance of a generator function is resumed, the CPython interpreter uses the last instruction pointer on the associated stack frame to determine where to begin executing the code object of the generator function. We can see this interactively using the handy disco method provided by the dis module:
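The original snippet was an image; a sketch of the same experiment, inspecting the generator's frame via its gi_frame and gi_code attributes (the generator function itself is hypothetical), might look like this:

```python
import dis

def two_numbers():
    yield 1
    yield 2

gen = two_numbers()
# Before the first next() call, no body code has run. In CPython 3.6 the
# last instruction pointer starts at -1; exact values vary across versions.
print(gen.gi_frame.f_lasti)

next(gen)
# The pointer now sits at (or near) the first yield in the bytecode.
print(gen.gi_frame.f_lasti)

# disco disassembles the code object, marking the instruction at the given
# offset with "-->":
dis.disco(gen.gi_code, gen.gi_frame.f_lasti)
```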

We create a simple generator function that yields two numbers. Calling the generator function creates and returns a generator object; no code in the generator function’s body is executed during this call, and the last instruction pointer is initialized to -1. As we begin executing the generator function by calling next, the last instruction pointer advances from one yield statement to the next (dis marks the instruction at the last instruction pointer with -->), pausing and then resuming from the same point, until the generator function is exhausted and a StopIteration exception is raised.
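The advance-until-exhaustion sequence can be sketched end to end (again with a hypothetical two-yield generator):

```python
def two_numbers():
    yield 1
    yield 2

gen = two_numbers()
next(gen)
first = gen.gi_frame.f_lasti   # pointer after pausing at the first yield
next(gen)
second = gen.gi_frame.f_lasti  # pointer has advanced to the second yield
print(second > first)

try:
    next(gen)                  # no more yields left to run
except StopIteration:
    print("generator exhausted")
```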

Wrapping it up, the key thing to remember is that Python generators encapsulate a stack frame and a code object. The stack frame is allocated on heap memory and holds a pointer to the last bytecode instruction that was run on the code object within the context of that frame. It is this last instruction pointer that tells the CPython interpreter where to resume executing when a generator function is resumed. These are the core building blocks on which generator functions thrive.

If you are feeling adventurous, you can take a look at the _PyEval_EvalCodeWithName function in Python/ceval.c and at Objects/genobject.c in the CPython source to see the implementation details. This blog was written against the source of CPython 3.6.

If generator functions were a mystery to you, hopefully this post has offered some clarity on how they work.


Published by HackerNoon on 2017/07/30