Execute Program

Python for Programmers: Consuming Iterators

Welcome to the Consuming Iterators lesson!

  • We've iterated over lists, tuples, strings, dictionaries, and ranges. All of those work with the for loop. It's tempting to imagine that the for loop has special knowledge about each of those iterable types.

  • Comprehensions can also iterate over lists, tuples, strings, dictionaries, and ranges. And many Python libraries expose custom data types that we can loop over with for loops and comprehensions.

  • for loops and comprehensions don't have special knowledge about lists, tuples, etc. Instead, iteration in Python relies on a powerful pair of ideas: iterables and iterators.

  • First, a note on terminology. The words "iterable" and "iterator" aren't specific to Python; they're standard in many other languages. Although these terms might seem confusingly similar at first, their names can help us make sense of what they do.

  • If something's "edible", we can eat it. If something's "modifiable", we can modify it. And if something's "iterable", we can iterate over it. Lists, tuples, strings, dictionaries, and ranges are all iterable.
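  • As a small illustration of that claim, the sketch below loops over one value of each type with the same comprehension. The variable names (iterables, collected, int_is_iterable) are ours, chosen only for this example. It also shows what happens with something that isn't iterable, such as an integer: Python raises a TypeError.

```python
# Five different built-in iterable types.
iterables = [
    [3, 2, 1],          # list
    (3, 2, 1),          # tuple
    "abc",              # string
    {"a": 1, "b": 2},   # dictionary (iterating gives us the keys)
    range(3),           # range
]

# The same comprehension works on every one of them.
collected = [[item for item in iterable] for iterable in iterables]

# An integer is not iterable, so trying to loop over it raises TypeError.
try:
    for item in 5:
        pass
    int_is_iterable = True
except TypeError:
    int_is_iterable = False
```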

  • If an iterable is a thing that we can iterate over, then what's an iterator? When we write for i in range(10), the loop needs to execute ten times. Python has to keep track of that iteration somewhere, like "we're on the 7th element, and there are still more elements left." That's the iterator's job.

  • The iterable is the data source that we're looping over. The iterator tracks our progress while looping through the iterable.

  • Here's an iterable, a list of numbers.

  • >
    numbers = [3, 2, 1]
  • We get an iterator by calling iter(numbers). iter is a built-in Python function that returns an iterator for an iterable.

  • >
    numbers = [3, 2, 1]
    numbers_iterator = iter(numbers)
  • Now we can call next(numbers_iterator) to get the first value in numbers. next is another built-in function.

  • >
    numbers = [3, 2, 1]
    numbers_iterator = iter(numbers)
    next(numbers_iterator)
    Result:
    3
  • The iterator tracks our progress through the list, so each next(numbers_iterator) call gives us a new number. The first call to next gives us numbers[0], the second call gives us numbers[1], etc.

  • >
    numbers = [3, 2, 1]
    numbers_iter = iter(numbers)
    next(numbers_iter) # This returns 3, but we don't store it anywhere.
    next(numbers_iter)
    Result:
    2
  • What happens when we get to the end of the list? When the iterator has no more data to give us, we say that it's "exhausted". Calling next on an exhausted iterator raises a StopIteration exception.

  • >
    numbers = [3, 2, 1]
    numbers_iter = iter(numbers)
    first_item = next(numbers_iter)
    second_item = next(numbers_iter)
    third_item = next(numbers_iter)
    fourth_item = next(numbers_iter)
    fourth_item
    Result:
    StopIteration
  • That StopIteration exception is a normal Python exception object. But unlike TypeError, ValueError, etc., it doesn't indicate a true error. It only means that we've reached the end of the iterator.
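  • To make that concrete, here's a minimal sketch that catches the exception with an ordinary try/except and inspects it. The names caught and is_exception are ours, used only for illustration.

```python
numbers_iter = iter([3])
next(numbers_iter)  # Consumes the only element, so the iterator is now exhausted.

try:
    next(numbers_iter)
except StopIteration as error:
    # We can catch StopIteration like any other exception.
    caught = error

# StopIteration is a subclass of Exception, just like TypeError and ValueError.
is_exception = isinstance(caught, Exception)
```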

  • In fact, Python uses this internally when running for loops. It creates an iterator, then calls next on it repeatedly until next(...) raises StopIteration. The loop catches that exception, then stops looping.

  • We can use this knowledge to mimic Python's built-in for loop. First, here's some code that uses a regular for.

  • >
    def double_each(numbers):
        results = []
        for n in numbers:
            results.append(n * 2)
        return results

    double_each([3, 2, 1])
    Result:
    [6, 4, 2]
  • Now here's a version that imitates what for does. Like a for loop, it calls next(...) to get each value in the list. We stop iterating when next(...) raises StopIteration.

  • (Although the while True loop condition may seem dangerous, the break in the except clause safely terminates it, preventing an infinite loop.)

  • >
    def double_each(numbers):
        results = []
        numbers_iter = iter(numbers)

        while True:
            try:
                next_value = next(numbers_iter)
                results.append(next_value * 2)
            except StopIteration:
                break

        return results
  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    double_each([3, 2, 1])
    Result:
    [6, 4, 2]
  • The double_each function works with any iterable. For example, we can pass it a dictionary where the keys are integers. It iterates over the keys, doubling each key. (It ignores the dictionary's values, since iterating over a dictionary only gives us the keys.)

  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    double_each({
        10: "ten",
        20: "twenty"
    })
    Result:
    [20, 40]
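  • The same holds for other iterable types. The sketch below repeats the hand-written double_each so that it runs on its own, then tries it with a tuple, a range, and a string. The variable names are ours. With a string, n * 2 repeats each character rather than doing arithmetic.

```python
def double_each(numbers):
    results = []
    numbers_iter = iter(numbers)

    while True:
        try:
            next_value = next(numbers_iter)
            results.append(next_value * 2)
        except StopIteration:
            break

    return results

from_tuple = double_each((3, 2, 1))
from_range = double_each(range(3))
# Iterating over a string gives one-character strings, and "a" * 2 is "aa".
from_string = double_each("ab")
```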
  • Here's a code problem:

    Write an enumerate_into_list function that mimics the built-in enumerate function. It takes an iterator and returns a list of tuples. Each tuple contains an element's index, along with the element itself. For example, enumerate_into_list(iter(["A", "B", "C"])) should return [(0, "A"), (1, "B"), (2, "C")].

    You'll need to repeatedly call next(iterator) until it raises StopIteration. When looping, remember that break exits the loop, but return exits the entire function. You'll need to add a break, but you won't need to add any new returns.

    # We overwrite the built-in `enumerate` function here. No calling it to cheat!
    def enumerate(*args, **kwargs):
        raise Exception("The built-in enumerate function is disabled!")

    def enumerate_into_list(iterator):
        results = []
        index = 0
        while True:
            try:
                value = next(iterator)
                results.append((index, value))
                index += 1
            except StopIteration:
                break

        return results

    assert enumerate_into_list(iter(["A", "B"])) == [(0, "A"), (1, "B")]
    assert enumerate_into_list(iter(range(3))) == [(0, 0), (1, 1), (2, 2)]
    assert enumerate_into_list(iter("pew")) == [(0, "p"), (1, "e"), (2, "w")]
    assert enumerate_into_list(iter({
        "Amir": "Ms. Fluff",
        "Betty": "Keanu"
    })) == [
        (0, "Amir"),
        (1, "Betty"),
    ]
  • StopIteration brings up a contentious point in programming: should we use exceptions for control flow? One argument is that exceptions should only be used to signal errors, like "you passed the wrong kind of data" (TypeError) or "you passed a value that I don't know how to handle" (ValueError).

  • As we just saw, iterators raise StopIteration to signal that they're exhausted. But exhausting an iterator isn't an error; it's a normal part of using iterators! This is a case where Python "uses exceptions for control flow".

  • We saw another case like this in an earlier lesson, where we tried to index into a dict, then caught the KeyError when the key didn't exist. We could've asked the dict whether it has the key, but real-world Python code often tries to access the key, then handles the KeyError exception if it happens. In general, using exceptions for control flow is more common in Python than in other languages.
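  • Here's a sketch of the two styles described above. The phone_book data and both function names are illustrative assumptions, not the earlier lesson's actual code. The first function asks the dict whether it has the key; the second tries the access and handles the KeyError.

```python
phone_book = {"Amir": "555-1234", "Betty": "123-4567"}

def lookup_by_asking(name):
    # Check for the key first, then index only if it exists.
    if name in phone_book:
        return phone_book[name]
    return None

def lookup_by_trying(name):
    # Index directly, and treat KeyError as "the key wasn't there".
    try:
        return phone_book[name]
    except KeyError:
        return None

found = lookup_by_trying("Amir")
missing = lookup_by_trying("Cindy")
```

  • Both functions return the same results for every name. The difference is only in how they handle the missing-key case.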

  • So far, we've focused on for loops. The built-in list function also relies on the iteration protocol. Internally, list(some_iterable) calls iter(some_iterable). Then it calls next on the iterator until it raises StopIteration, just like a for loop does.

  • What if we create an iterator, iterate over some of its elements, and only then give it to list? The answer is that list doesn't know or care about the iterator's history! It always does the same thing: it calls next(...) on the iterator until the iterator raises StopIteration.

  • In the example below, we create a three-element list, get an iterator for it, and call next once. Then the iterator has two elements left. We call list(...) on it, which gives us a list of those two elements.

  • >
    numbers = [3, 2, 1]
    numbers_iter = iter(numbers)
    next(numbers_iter) # This returns 3, but we don't store it anywhere.
    remaining = list(numbers_iter)
    remaining
    Result:
    [2, 1]
  • That's a concrete example of why iterators exist. They're not collections of data, in the way that a list or dictionary is. Instead, an iterator's job is to give us a way to request the next value, making sure that we get each value in the correct order. In the example above, the iterator remembered where we were in the list: we'd already consumed one of the three list elements.
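  • One way to see that a list iterator isn't a collection: it doesn't support len(...) or indexing, both of which raise TypeError. The sketch below checks that, using flag names (has_len, supports_indexing) that are ours. All the iterator offers is next.

```python
numbers = [3, 2, 1]
numbers_iter = iter(numbers)

# A list iterator has no length...
try:
    len(numbers_iter)
    has_len = True
except TypeError:
    has_len = False

# ...and we can't index into it.
try:
    numbers_iter[0]
    supports_indexing = True
except TypeError:
    supports_indexing = False

# Asking for the next value is the one thing it can do.
first = next(numbers_iter)
```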

  • We get the same result no matter how we consume the iterator. For example, a for loop gives us the same result that we saw above with list(...).

  • >
    numbers = [3, 2, 1]
    numbers_iter = iter(numbers)
    next(numbers_iter) # This returns 3, but we don't store it anywhere.

    result = []
    for value in numbers_iter:
        result.append(value)
    result
    Result:
    [2, 1]
  • For convenience, functions that consume an iterable usually accept an iterator as well. For example, list(some_iterable) gives us the same result as list(iter(some_iterable)). Internally, list calls iter(some_iterable) for us.

  • >
    phone_book = {
        "Amir": "555-1234",
        "Betty": "123-4567"
    }
    list(phone_book)
    Result:
    ['Amir', 'Betty']
  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    list(iter(phone_book))
    Result:
    ['Amir', 'Betty']
  • This works because of an important property of iterators: calling iter on an iterator returns the iterator itself. It's not just an equivalent iterator, but exactly the same iterator object. We can check that by doing iter(some_iterator) is some_iterator.

  • >
    numbers = [3, 2, 1]
    numbers_iter = iter(numbers)
    iter(numbers_iter) is numbers_iter
    Result:
    True
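  • That property is what lets one function handle both iterables and iterators. The to_list function below is a hypothetical stand-in that we wrote to illustrate the protocol; it is not how the real list is implemented. It always calls iter on its argument. For a list, that creates a fresh iterator. For an iterator, it returns the same iterator, including whatever progress it has already made.

```python
def to_list(iterable_or_iterator):
    # For an iterable, this creates a new iterator.
    # For an iterator, this returns the iterator itself.
    iterator = iter(iterable_or_iterator)

    results = []
    while True:
        try:
            results.append(next(iterator))
        except StopIteration:
            break
    return results

numbers = [3, 2, 1]
from_iterable = to_list(numbers)

numbers_iter = iter(numbers)
next(numbers_iter)  # Consume the 3 before handing the iterator over.
from_iterator = to_list(numbers_iter)
```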
  • We can create multiple iterators over the same iterable. For example, we might create two iterators over a list. Those two iterators are different objects, with different identities. They track their progress separately, so advancing one iterator doesn't affect the other.

  • >
    numbers = [3, 2, 1]
    iter1 = iter(numbers)
    iter2 = iter(numbers)
    iter1 is iter2
    Result:
    False
  • >
    numbers = [3, 2, 1]
    iter1 = iter(numbers)
    iter2 = iter(numbers)
    next(iter1)
    Result:
    3
  • >
    numbers = [3, 2, 1]
    iter1 = iter(numbers)
    iter2 = iter(numbers)
    next(iter1)
    next(iter1)
    Result:
    2
  • >
    numbers = [3, 2, 1]
    iter1 = iter(numbers)
    iter2 = iter(numbers)
    next(iter1)
    next(iter1)
    next(iter2)
    Result:
    3
  • Sometimes we only consume part of an iterator, then discard it.

  • >
    def consume2(iterator):
        return [next(iterator), next(iterator)]
  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    iterator = iter([1, 2, 3, 4, 5])
    consume2(iterator)
    Result:
    [1, 2]
  • When the consume2(...) call finishes, the iterator is still ready to loop over the remaining list elements. We can continue to consume its values if we want to.

  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    iterator = iter([1, 2, 3, 4, 5])
    consume2(iterator)
    consume2(iterator)
    Result:
    [3, 4]
  • Our list only contains five elements. If we call consume2 on the same iterator a third time, we run out of elements. consume2 consumes the 5, then calls next again, which raises StopIteration. Our consume2 function doesn't catch the StopIteration exception, so the exception escapes and causes an error.

  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    iterator = iter([1, 2, 3, 4, 5])
    consume2(iterator)
    consume2(iterator)
    consume2(iterator)
    Result:
    StopIteration
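  • If we'd rather get a shorter list than an error, one option is to catch the exception inside the function. The consume_up_to_2 function below is a hypothetical variant of consume2, sketched under that assumption. It returns whatever values it managed to consume before the iterator ran out.

```python
def consume_up_to_2(iterator):
    results = []
    try:
        results.append(next(iterator))
        results.append(next(iterator))
    except StopIteration:
        # The iterator ran out early. Keep whatever we've collected so far.
        pass
    return results

iterator = iter([1, 2, 3, 4, 5])
first = consume_up_to_2(iterator)
second = consume_up_to_2(iterator)
third = consume_up_to_2(iterator)   # Only the 5 is left.
fourth = consume_up_to_2(iterator)  # The iterator is already exhausted.
```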
  • We can summarize iterators in a short sentence: "calling next(some_iterator) gives us the next value, or raises StopIteration if there isn't one." But as we've seen, there are some subtleties behind that simple description.

  • This lesson focused on consuming iterables and iterators, but we can also write our own iterables from scratch. In a future lesson, we'll write custom iterables that work with for and comprehensions, just like built-in data types do.