Python in Detail: Generator Functions
The iteration protocol lets us define our own iterators, which is very useful, but it requires a lot of work. We need to define an iterable class and an iterator class, and both classes need to implement the correct dunder methods: .__iter__ and .__next__.

It's good to understand how iterators actually work internally by defining both of those methods. But in many cases, we can avoid that complexity by using generators.
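As a reminder of what that work looks like, here is a minimal sketch of a hand-written iterable and iterator that count from a start value up to (but not including) an end value. The class names SimpleRange and SimpleRangeIterator are hypothetical, chosen for illustration rather than taken from an earlier lesson.

```python
class SimpleRange:
    """Iterable: holds the data (here, just the two bounds)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        # Each call hands out a fresh iterator with its own progress.
        return SimpleRangeIterator(self.start, self.end)


class SimpleRangeIterator:
    """Iterator: tracks how far iteration has progressed."""

    def __init__(self, current_value, end):
        self.current_value = current_value
        self.end = end

    def __iter__(self):
        # By convention, an iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal the end of iteration once we reach the upper bound.
        if self.current_value >= self.end:
            raise StopIteration
        value = self.current_value
        self.current_value += 1
        return value


class_based = list(SimpleRange(2, 7))
print(class_based)
```

That is two classes and five methods for what amounts to a counting loop, which is the kind of boilerplate generators let us avoid.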
Any function with a yield expression is a generator function. When we call the function, we get a generator object, which is an iterator that we can use anywhere we'd normally use an iterator: with next(...), in for loops, in comprehensions, etc.

The next example uses a generator to implement a simplified version of the built-in range function. We'll look at the function first, then analyze its behavior.

>
def simple_range(start, end):
    current_value = start
    while current_value < end:
        yield current_value
        current_value += 1

Each yield produces one value for the iterator. When we call next, we get those values back. In the case of simple_range(2, 7), we get an iterator with the values 2, 3, 4, 5, and 6.

- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
my_iter = simple_range(2, 7)
results = []
results.append(next(my_iter))
results.append(next(my_iter))
results.append(next(my_iter))
results
Result:
[2, 3, 4]
The generator works with any code that consumes iterators. For example, instead of manually appending values to a list, we can use the built-in list(...) function.

- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
list(simple_range(2, 7))
Result:
[2, 3, 4, 5, 6]
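The same holds for the other iterator consumers mentioned earlier. The sketch below repeats the simple_range definition so that it runs on its own, then feeds the generator to a for loop, a list comprehension, and the built-in sum(...). The variable names seen, squares, and total are only illustrative.

```python
def simple_range(start, end):
    current_value = start
    while current_value < end:
        yield current_value
        current_value += 1


# A for loop calls next(...) for us until the generator is exhausted.
seen = []
for value in simple_range(2, 7):
    seen.append(value)

# Comprehensions and built-ins such as sum(...) consume it the same way.
squares = [value * value for value in simple_range(2, 7)]
total = sum(simple_range(2, 7))

print(seen, squares, total)
```

Each consumer gets its own call to simple_range(2, 7), so each one works with a fresh generator.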
At first, it's tempting to think that the simple_range function runs to completion, producing a list of values, and those values show up in the iterator. But like a regular iterator, it's working with one value at a time.

We can verify that by building a generator that yields an infinite number of values. Python definitely isn't collecting all of the values in advance, because that's not possible!

>
def positive_integers():
    i = 1
    while True:
        yield i
        i += 1

my_iter = positive_integers()
(next(my_iter), next(my_iter), next(my_iter))
Result:
(1, 2, 3)
How can that infinite loop finish in a finite amount of time? The answer is that it didn't finish!
Here's an expanded version of that same example, to show more detail. We've added a few print statements to help us trace the execution.
>
def positive_integers():
    print("Generator starting")
    i = 1
    while True:
        print("Yielding", i)
        yield i
        i += 1

print("Calling positive_integers")
my_iter = positive_integers()
# Only consume 3 elements so we don't end up looping infinitely.
for _ in range(0, 3):
    print("Calling next")
    value = next(my_iter)
    print("Consumed", value)
console output:
Calling positive_integers
Calling next
Generator starting
Yielding 1
Consumed 1
Calling next
Yielding 2
Consumed 2
Calling next
Yielding 3
Consumed 3

We'll start by only looking at the first three lines of the output.
When we call positive_integers(), the function doesn't actually run. It only creates the iterator. We can see that from the fact that the "Calling positive_integers" and "Calling next" lines both print before "Generator starting". Then, when we call next(my_iter) inside the loop, the generator function finally starts and prints "Generator starting".

The generator function continues executing until it reaches a yield. At that point, the generator has produced the next iterated value, so it stops executing. The next(my_iter) call finally returns with the return value 1. When we call next(my_iter) again, the generator wakes up, runs until it encounters a yield, then stops again.

Note that Python isn't calling the positive_integers function repeatedly. Instead, each yield pauses execution of the function, remembering where it was: which line of code it is currently executing, what the current values of the local variables are, etc. When we call next again, the function unpauses and picks up exactly where it left off. It runs until it hits another yield, which gives us another value to satisfy the next call.

At runtime, control continually switches between the generator and the for loop consuming it. Execution repeatedly jumps from the generator to the loop to the generator to the loop and so on. Only one of the two is running at any given time.

Here's a code problem:
Write a generator, countdown, that takes a number n, and then yields every number from n down to (and including) 0.

Don't forget to yield the correct value!

def countdown(n):
    while n >= 0:
        yield n
        n -= 1

from_5 = countdown(5)
assert next(from_5) == 5
assert next(from_5) == 4
assert list(from_5) == [3, 2, 1, 0]

from_0 = countdown(0)
assert list(from_0) == [0]
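The pause-and-resume behavior described earlier can also be observed directly. The sketch below is one way to do it, assuming the standard library's inspect.getgeneratorstate function and the generator's gi_frame attribute. It repeats a countdown definition so that it runs on its own, and the variable names are only illustrative.

```python
import inspect


def countdown(n):
    while n >= 0:
        yield n
        n -= 1


gen = countdown(1)
# Calling the generator function has not run any of its body yet.
state_before = inspect.getgeneratorstate(gen)

first = next(gen)
# The function is now paused at the yield, with its locals preserved.
state_paused = inspect.getgeneratorstate(gen)
paused_n = gen.gi_frame.f_locals["n"]

remaining = list(gen)
# The while loop has ended, so the function body has returned.
state_after = inspect.getgeneratorstate(gen)

print(state_before, state_paused, paused_n, state_after)
```

The three states match the three phases we traced by hand: created but not started, suspended at a yield, and finished.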
Now that we've seen generators, let's review the terminology:
- Any function with a yield inside is a generator function.
- When we call the generator function, we get a generator.
- The generator is a kind of iterator.
>
def one():
    yield 1

my_iter = one()
type(my_iter).__name__
Result:
'generator'
Generators are iterators, but where is the corresponding iterable? Remember that an iterable (for example, a list or a dictionary) holds the data. An iterator is a separate object that tracks iteration progress. To get an iterator, we usually call iter(some_iterable), like iter([1, 2, 3]).

None of the examples above called iter on anything. That's because there is no separate iterable when using generators! We never have to call iter on anything.

This may seem strange, because until now we always saw iterables and iterators appearing together. But it's perfectly fine to have an iterator without an iterable.
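One consequence is worth sketching. Calling iter on a generator is allowed, but it returns the generator itself rather than a new iterator, so there is nothing to restart from once the values have been consumed. The example below checks this with collections.abc.Iterator from the standard library. The variable names are only illustrative.

```python
from collections.abc import Iterator


def one():
    yield 1


gen = one()

# A generator satisfies the iterator interface...
is_iterator = isinstance(gen, Iterator)
# ...and iter(...) on it hands back the very same object.
same_object = iter(gen) is gen

# With no separate iterable behind it, a generator is single-use.
first_pass = list(gen)
second_pass = list(gen)

print(is_iterator, same_object, first_pass, second_pass)
```

To iterate again, we call the generator function again and get a fresh generator.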
Generators can remove a lot of boilerplate code. For example, in an earlier lesson we wrote Primes and PrimesIterator classes, which iterated over all prime numbers. But it took a lot of code: two classes with five methods, just to generate prime numbers. With generators, we can do the same work in much less code!

>
def primes():
    n = 2
    while True:
        if is_prime(n):
            yield n
        n += 1

def is_prime(n):
    for i in range(2, n // 2 + 1):
        if n % i == 0:
            return False
    return True

my_iter = primes()
(next(my_iter), next(my_iter), next(my_iter), next(my_iter), next(my_iter))
Result:
(2, 3, 5, 7, 11)
The generator function lets us skip all of the iterator protocol boilerplate. There are some situations where we need to implement the full iteration protocol, but many iterators can be written more easily as generators.
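Because primes() never finishes on its own, the consuming code has to decide when to stop. Calling next(...) a fixed number of times works, and the standard library's itertools module offers a tidier alternative. The sketch below is not part of the lesson's own examples. It repeats the primes definitions so that it runs on its own, and uses itertools.islice and itertools.takewhile to take a finite slice of the infinite generator.

```python
from itertools import islice, takewhile


def is_prime(n):
    for i in range(2, n // 2 + 1):
        if n % i == 0:
            return False
    return True


def primes():
    n = 2
    while True:
        if is_prime(n):
            yield n
        n += 1


# Stop after a fixed number of values.
first_five = list(islice(primes(), 5))

# Stop as soon as a value fails the condition.
below_twenty = list(takewhile(lambda p: p < 20, primes()))

print(first_five, below_twenty)
```

Both helpers are lazy themselves: they pull values from the generator one at a time and stop asking once their condition is met.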