Python for Programmers: Consuming Iterators
Welcome to the Consuming Iterators lesson!
We've iterated over lists, tuples, strings, dictionaries, and ranges. All of those work with the for loop. It's tempting to imagine that the for loop has special knowledge about each of those iterable types.
Comprehensions can also iterate over lists, tuples, strings, dictionaries, and ranges. And many Python libraries expose custom data types that we can loop over with for loops and comprehensions.
for loops and comprehensions don't have special knowledge about lists, tuples, etc. Instead, iteration in Python relies on a powerful pair of ideas: iterables and iterators.
First, a note on terminology. The words "iterable" and "iterator" aren't specific to Python; they're standard in many other languages. Although these terms might seem confusingly similar at first, their names can help us make sense of what they do.
If something's "edible", we can eat it. If something's "modifiable", we can modify it. And if something's "iterable", we can iterate over it. Lists, tuples, strings, dictionaries, and ranges are all iterable.
If an iterable is a thing that we can iterate over, then what's an iterator? When we write for i in range(10), the loop needs to execute ten times. Python has to keep track of that iteration somewhere, like "we're on the 7th element, and there are still more elements left." That's the iterator's job.
The iterable is the data source that we're looping over. The iterator tracks our progress while looping through the iterable.
Here's an iterable, a list of numbers.
>
numbers = [3, 2, 1]
We get an iterator by calling iter(numbers). iter is a built-in Python function that returns an iterator for an iterable.
>
numbers = [3, 2, 1]
numbers_iterator = iter(numbers)
Now we can call next(numbers_iterator) to get the first value in numbers. next is another built-in function.
>
numbers = [3, 2, 1]
numbers_iterator = iter(numbers)
next(numbers_iterator)
Result:
3
The iterator tracks our progress through the list, so each next(numbers_iterator) call gives us a new number. The first call to next gives us numbers[0], the second call gives us numbers[1], etc.
>
numbers = [3, 2, 1]
numbers_iter = iter(numbers)
next(numbers_iter) # This returns 3, but we don't store it anywhere.
next(numbers_iter)
Result:
2
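One detail is worth checking at this point: the list itself is not an iterator. The short sketch below (the variable names are only illustrative) suggests that next works on the object returned by iter, while passing the list directly raises a TypeError.

```python
numbers = [3, 2, 1]

# A list is iterable, but it is not an iterator, so next() rejects it.
try:
    next(numbers)
    list_is_iterator = True
except TypeError:
    list_is_iterator = False

# iter() gives us the object that next() actually works with.
numbers_iter = iter(numbers)
first = next(numbers_iter)
second = next(numbers_iter)

print(list_is_iterator)  # False
print(first, second)     # 3 2
```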
What happens when we get to the end of the list? When the iterator has no more data to give us, we say that it's "exhausted". Calling next on an exhausted iterator raises a StopIteration exception.
>
numbers = [3, 2, 1]
numbers_iter = iter(numbers)
first_item = next(numbers_iter)
second_item = next(numbers_iter)
third_item = next(numbers_iter)
fourth_item = next(numbers_iter)
fourth_item
Result:
StopIteration:
That StopIteration exception is a normal Python exception object. But unlike TypeError, ValueError, etc., it doesn't indicate a true error. It only means that we've reached the end of the iterator.
In fact, Python uses this internally when running for loops. It creates an iterator, then calls next on it repeatedly until next(...) raises StopIteration. The loop catches that exception, then stops looping.
We can use this knowledge to mimic Python's built-in for loop. First, here's some code that uses a regular for.
>
def double_each(numbers):
    results = []
    for n in numbers:
        results.append(n * 2)
    return results

double_each([3, 2, 1])
Result:
[6, 4, 2]
Now here's a version that imitates what for does. Like a for loop, it calls next(...) to get each value in the list. We stop iterating when next(...) raises StopIteration.
(Although the while True loop condition may seem dangerous, the break in the except clause safely terminates it, preventing an infinite loop.)
>
def double_each(numbers):
    results = []
    numbers_iter = iter(numbers)
    while True:
        try:
            next_value = next(numbers_iter)
            results.append(next_value * 2)
        except StopIteration:
            break
    return results
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
double_each([3, 2, 1])
Result:
[6, 4, 2]
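As a variation on the manual loop, here is a sketch that keeps only the next(...) call inside the try block. The name double_each_narrow is made up for this illustration. The idea is that a StopIteration raised by anything other than next(...) would not be silently treated as the end of the data.

```python
def double_each_narrow(numbers):
    """Double each value, keeping only the next() call inside the try block."""
    results = []
    numbers_iter = iter(numbers)
    while True:
        try:
            next_value = next(numbers_iter)
        except StopIteration:
            # The iterator is exhausted, so we leave the loop.
            break
        # This line runs outside the try block, so a StopIteration raised
        # here by some other code would not be mistaken for "end of data".
        results.append(next_value * 2)
    return results

print(double_each_narrow([3, 2, 1]))  # [6, 4, 2]
print(double_each_narrow("ab"))       # ['aa', 'bb']
```

For this particular function the two versions behave the same; the narrower try block mostly matters when the loop body calls other code that might raise its own exceptions.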
The double_each function works with any iterable. For example, we can pass it a dictionary where the keys are integers. It iterates over the keys, doubling each key. (It ignores the dictionary's values, since iterating over a dictionary only gives us the keys.)
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
double_each({
    10: "ten",
    20: "twenty"
})
Result:
[20, 40]
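If we do want a dictionary's values, the dictionary's values() and items() methods return objects that are iterable too. The sketch below uses a hypothetical helper, collect, that gathers values with iter and next in the same style as the loop above.

```python
def collect(iterable):
    """Gather every value from an iterable using iter() and next() directly."""
    results = []
    iterator = iter(iterable)
    while True:
        try:
            results.append(next(iterator))
        except StopIteration:
            break
    return results

number_names = {10: "ten", 20: "twenty"}

keys = collect(number_names)             # iterating a dict yields its keys
values = collect(number_names.values())  # values() yields the values
pairs = collect(number_names.items())    # items() yields (key, value) tuples

print(keys)    # [10, 20]
print(values)  # ['ten', 'twenty']
print(pairs)   # [(10, 'ten'), (20, 'twenty')]
```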
Here's a code problem:
Write an enumerate_into_list function that mimics the built-in enumerate function. It takes an iterator and returns a list of tuples. Each tuple contains an element's index, along with the element itself. For example, enumerate_into_list(iter(["A", "B", "C"])) should return [(0, "A"), (1, "B"), (2, "C")].
You'll need to repeatedly call next(iterator) until it raises StopIteration. When looping, remember that break exits the loop, but return exits the entire function. You'll need to add a break, but you won't need to add any new returns.
>
# We overwrite the built-in `enumerate` function here. No calling it to cheat!
def enumerate(*args, **kwargs):
    raise Exception("The built-in enumerate function is disabled!")

def enumerate_into_list(iterator):
    results = []
    index = 0
    while True:
        try:
            value = next(iterator)
            results.append((index, value))
            index += 1
        except StopIteration:
            break
    return results

assert enumerate_into_list(iter(["A", "B"])) == [(0, "A"), (1, "B")]
assert enumerate_into_list(iter(range(3))) == [(0, 0), (1, 1), (2, 2)]
assert enumerate_into_list(iter("pew")) == [(0, "p"), (1, "e"), (2, "w")]
assert enumerate_into_list(iter({
    "Amir": "Ms. Fluff",
    "Betty": "Keanu"
})) == [
    (0, "Amir"),
    (1, "Betty"),
]
StopIteration brings up a contentious point in programming: should we use exceptions for control flow? One argument is that exceptions should only be used to signal errors, like "you passed the wrong kind of data" (TypeError) or "you passed a value that I don't know how to handle" (ValueError).
As we just saw, iterators raise StopIteration to signal that they're exhausted. But exhausting an iterator isn't an error; it's a normal part of using iterators! This is a case where Python "uses exceptions for control flow".
We saw another case like this in an earlier lesson, where we tried to index into a dict, then caught the KeyError when the key didn't exist. We could've asked the dict whether it has the key, but real-world Python code often tries to access the key, then handles the KeyError exception if it happens. In general, using exceptions for control flow is more common in Python than in other languages.
So far, we've focused on for loops. Lists also use the iteration protocol heavily. Internally, list(some_iterable) calls iter(some_iterable). Then it calls next on the iterator until it raises StopIteration, just like a for loop does.
What if we create an iterator, iterate over some of its elements, and only then give it to list? The answer is that list doesn't know or care about the iterator's history! It always does the same thing: it calls next(...) on the iterator until the iterator raises StopIteration.
In the example below, we create a three-element list, get an iterator for it, and call next once. Then the iterator has two elements left. We call list(...) on it, which gives us a list of those two elements.
>
numbers = [3, 2, 1]
numbers_iter = iter(numbers)
next(numbers_iter) # This returns 3, but we don't store it anywhere.
remaining = list(numbers_iter)
remaining
Result:
[2, 1]
That's a concrete example of why iterators exist. They're not collections of data, in the way that a list or dictionary is. Instead, an iterator's job is to give us a way to request the next value, making sure that we get each value in the correct order. In the example above, the iterator remembered where we were in the list: we'd already consumed one of the three list elements.
We get the same result no matter how we consume the iterator. For example, a for loop gives us the same result that we saw above with list(...).
>
numbers = [3, 2, 1]
numbers_iter = iter(numbers)
next(numbers_iter) # This returns 3, but we don't store it anywhere.
result = []
for value in numbers_iter:
    result.append(value)
result
Result:
[2, 1]
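Other built-in consumers appear to behave the same way. As a rough sketch, the hypothetical helper skip_first below returns an iterator whose first element is already consumed, and we then hand that iterator to sum, tuple, and a list comprehension. Each one only sees the remaining elements.

```python
def skip_first(values):
    """Return an iterator over values with its first element already consumed."""
    iterator = iter(values)
    next(iterator)  # Discard the first element.
    return iterator

# Each consumer starts from wherever the iterator currently is.
total = sum(skip_first([3, 2, 1]))
as_tuple = tuple(skip_first([3, 2, 1]))
doubled = [n * 2 for n in skip_first([3, 2, 1])]

print(total)     # 3
print(as_tuple)  # (2, 1)
print(doubled)   # [4, 2]
```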
For convenience, functions that accept iterables or iterators usually accept either one. For example, list(some_iterable) gives us the same result as list(iter(some_iterable)). Internally, list calls iter(some_iterable) for us.
>
phone_book = {
    "Amir": "555-1234",
    "Betty": "123-4567"
}
list(phone_book)
Result:
['Amir', 'Betty']
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
list(iter(phone_book))
Result:
['Amir', 'Betty']
This works because of an important property of iterators: calling iter on an iterator returns the iterator itself. It's not just an equivalent iterator, but exactly the same iterator object. We can check that by doing iter(some_iterator) is some_iterator.
>
numbers = [3, 2, 1]
numbers_iter = iter(numbers)
iter(numbers_iter) is numbers_iter
Result:
True
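That property is what lets our own functions accept either kind of argument. Here is a minimal sketch, using a made-up function named first_two: it always calls iter on its argument first, which creates a new iterator for a list but hands back the same object for an iterator.

```python
def first_two(iterable_or_iterator):
    """Return the first two values from an iterable or from an iterator."""
    # For a list, this creates a new iterator.
    # For an iterator, this returns the very same object.
    iterator = iter(iterable_or_iterator)
    return [next(iterator), next(iterator)]

numbers = [3, 2, 1]
from_list = first_two(numbers)

numbers_iter = iter(numbers)
from_iterator = first_two(numbers_iter)
# first_two advanced our iterator, so only the last element is left.
leftover = next(numbers_iter)

print(from_list)      # [3, 2]
print(from_iterator)  # [3, 2]
print(leftover)       # 1
```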
We can create multiple iterators over the same iterable. For example, we might create two iterators over a list. Those two iterators are different objects, with different identities. They track their progress separately, so advancing one iterator doesn't affect the other.
>
numbers = [3, 2, 1]
iter1 = iter(numbers)
iter2 = iter(numbers)
iter1 is iter2
Result:
False
>
numbers = [3, 2, 1]
iter1 = iter(numbers)
iter2 = iter(numbers)
next(iter1)
Result:
3
>
numbers = [3, 2, 1]
iter1 = iter(numbers)
iter2 = iter(numbers)
next(iter1)
next(iter1)
Result:
2
>
numbers = [3, 2, 1]
iter1 = iter(numbers)
iter2 = iter(numbers)
next(iter1)
next(iter1)
next(iter2)
Result:
3
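A related consequence, sketched below with illustrative variable names: once a list iterator is exhausted, it stays exhausted, while the list itself can hand out a fresh iterator every time we ask for one.

```python
numbers = [3, 2, 1]
numbers_iter = iter(numbers)

first_pass = list(numbers_iter)   # Consumes every element.
second_pass = list(numbers_iter)  # The same iterator has nothing left to give.
fresh_pass = list(numbers)        # list() asks the list for a brand new iterator.

print(first_pass)   # [3, 2, 1]
print(second_pass)  # []
print(fresh_pass)   # [3, 2, 1]
```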
Sometimes we only consume part of an iterator, then discard it.
>
def consume2(iterator):
    return [next(iterator), next(iterator)]
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
iterator = iter([1, 2, 3, 4, 5])
consume2(iterator)
Result:
[1, 2]
When the consume2(...) call finishes, the iterator is still ready to loop over the remaining list elements. We can continue to consume its values if we want to.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
iterator = iter([1, 2, 3, 4, 5])
consume2(iterator)
consume2(iterator)
Result:
[3, 4]
Our list only contains five elements. If we call consume2 on the same iterator a third time, we run out of elements. consume2 consumes the 5, then calls next again, which raises StopIteration. Our consume2 function doesn't catch the StopIteration exception, so the exception escapes and causes an error.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
iterator = iter([1, 2, 3, 4, 5])
consume2(iterator)
consume2(iterator)
consume2(iterator)
Result:
StopIteration:
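If we'd rather get a shorter list than an error, one possible approach is to catch StopIteration inside the function. The sketch below uses a hypothetical function, consume_up_to, which is not part of the lesson's code. It returns whatever values it managed to collect before the iterator ran out.

```python
def consume_up_to(iterator, count):
    """Return up to `count` values, stopping early if the iterator runs out."""
    values = []
    for _ in range(count):
        try:
            values.append(next(iterator))
        except StopIteration:
            # The iterator is exhausted; return what we collected so far.
            break
    return values

iterator = iter([1, 2, 3, 4, 5])
first = consume_up_to(iterator, 2)
second = consume_up_to(iterator, 2)
third = consume_up_to(iterator, 2)   # Only one element is left.
fourth = consume_up_to(iterator, 2)  # Nothing is left.

print(first)   # [1, 2]
print(second)  # [3, 4]
print(third)   # [5]
print(fourth)  # []
```

Whether to catch the exception or let it escape depends on the caller: letting it escape makes "not enough data" impossible to overlook, while catching it keeps the call site simpler.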
We can summarize iterators in a short sentence: "calling next(some_iterator) gives us the next value, or raises StopIteration if there isn't one." But as we've seen, there are some subtleties behind that simple description.
This lesson focused on consuming iterables and iterators, but we can also write our own iterables from scratch. In a future lesson, we'll write custom iterables that work with for and comprehensions, just like built-in data types do.