Execute Program

Python for Programmers: Mixing up Iterables


  • It's common to split a string, then manipulate only a single element from the split. In the next example, we take a file path in the Unix style, where Amir's files are stored in /home/amir. We extract the username component of that path.

  • (In the next example, remember that the list returned by .split includes an empty string, "", when the string starts with the separator.)

  • >
    def extract_username(path):
        path_components = path.split("/")
        username = path_components[2]
        return username

    extract_username("/home/amir/proj/")
    Result:
    'amir'
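  • To see why index 2 holds the username, it can help to look at the whole list that .split returns. The snippet below is a small illustration using the same path as above; nothing in it goes beyond the standard str.split method.

```python
path = "/home/amir/proj/"

# The path starts and ends with "/", so the list begins and ends with "".
path_components = path.split("/")
print(path_components)     # ['', 'home', 'amir', 'proj', '']

# Index 0 is the empty string, index 1 is "home", index 2 is the username.
print(path_components[2])  # amir
```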
  • Now we'll add a simple bug to that code. Instead of username = path_components[2], we'll accidentally do username = path[2]. This doesn't raise an exception, since path is a string and indexing into a string is perfectly legal: it returns the character at that index.

  • >
    def extract_username(path):
        path_components = path.split("/")
        # Mistakenly refer to `path`, not `path_components`.
        username = path[2]
        return username

    extract_username("/home/amir/proj/")
    Result:
    'o'
  • This common mistake happens because Python's lists and strings both support the indexing operator. It doesn't result in an exception, which can make it hard to debug. When this happens, the symptom will be that "o" (or some other single character) will show up as a username somewhere. It might show up in the user interface, or in a log file, or in an exception message.
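  • As a side-by-side illustration, the sketch below applies the same operations to the string and to the list. The variable names from_list and from_string are made up for this example. Neither line raises an exception; only the results differ.

```python
path = "/home/amir/proj/"
path_components = path.split("/")

# Both objects accept [2] and len() without complaint.
from_list = path_components[2]
from_string = path[2]

print(repr(from_list))    # 'amir'
print(repr(from_string))  # 'o'
print(type(path_components).__name__, len(path_components))  # list 5
print(type(path).__name__, len(path))                        # str 16
```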

  • A similar issue can happen with iteration. Iterating over a string gives us a series of single-character strings.

  • >
    chars = []
    for character in "cat":
        chars.append(character)
    chars
    Result:
    ['c', 'a', 't']
  • In the next example, we want to know whether the path contains "home". We call .split on the path, then iterate over the list of splits.

  • >
    def home_in_path(path):
        for component in path.split("/"):
            if component == "home":
                return True
        return False

    home_in_path("/home/amir/proj/")
    Result:
    True
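  • The same check can also be written with the in operator, which tests list membership. The sketch below is an illustration rather than the lesson's own code; the name home_in_path_membership and the /homework/ path are made up for the example. It also shows that dropping .split still raises no exception here: in on a string quietly becomes a substring test.

```python
def home_in_path_membership(path):
    # `in` on a list checks whether any element equals "home".
    return "home" in path.split("/")

print(home_in_path_membership("/home/amir/proj/"))      # True
print(home_in_path_membership("/homework/amir/proj/"))  # False

# Without .split, `in` on a string is a substring test, so "homework" matches.
print("home" in "/homework/amir/proj/")                 # True
```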
  • It's easy to forget the .split call. We might write for component in path instead of for component in path.split("/"). This unintentionally iterates over every character in path, so our if will never see the target string of "home". Once again, there's no exception, because iterating over a string is perfectly legal.

  • >
    path = "/home/amir/proj/"
    home_in_path = False

    for component in path:
        print(component)
    console output:
    /
    h
    o
    m
    e
    /
    a
    m
    i
    r
    /
    p
    r
    o
    j
    /
  • Now let's introduce that mistake in the full code example, where we compute home_in_path.

  • >
    def home_in_path(path):
        # Mistakenly refer to `path`, not `path.split("/")`.
        for component in path:
            if component == "home":
                return True

        return False

    home_in_path("/home/amir/proj/")
    Result:
    False
  • The code runs successfully, but we get False when we expect True. In this case we made the mistake intentionally, so we can see why that happened. But when changing a large system, this kind of mistake is very easy to miss. Unfortunately, we don't get much feedback about what's wrong: it just looks like "home" isn't in path when it should be.
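  • One way to get more feedback when a result like this looks wrong is to inspect what the loop actually iterates over. The sketch below is only a debugging illustration; the names buggy_components and intended_components are made up for it.

```python
path = "/home/amir/proj/"

# What the buggy loop iterates over: one single-character string at a time.
buggy_components = list(path)
# What the intended loop iterates over: the pieces between slashes.
intended_components = path.split("/")

print(buggy_components[:5])  # ['/', 'h', 'o', 'm', 'e']
print(intended_components)   # ['', 'home', 'amir', 'proj', '']

# No single character can equal "home", so the buggy comparison never succeeds.
print(any(component == "home" for component in buggy_components))     # False
print(any(component == "home" for component in intended_components))  # True
```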

  • We just saw two similar bugs, where we thought we had a list but we actually had a string. If you ever find your results filled with unexpected single-character strings, it's possible that you mixed up a string and a list in this way.

  • Strings and lists are the most common culprits here, but this problem isn't specific to them. Similar problems can happen when we mix up any combination of strings, lists, tuples, dicts, and other iterable data types.
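  • Two illustrations of the same kind of mix-up with other types are sketched below. The user dict and the tuple variables are hypothetical and exist only for this example. As with strings and lists, every line runs without an exception.

```python
user = {"name": "amir", "home": "/home/amir"}

# Iterating over a dict yields its keys, not its values.
print(list(user))           # ['name', 'home']
print(list(user.values()))  # ['amir', '/home/amir']

# A one-element tuple needs a trailing comma; without it, we just have a string.
names_tuple = ("amir",)
not_a_tuple = ("amir")
print(list(names_tuple))    # ['amir']
print(list(not_a_tuple))    # ['a', 'm', 'i', 'r']
```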

  • Here's a code problem:

    The name_list_is_alphabetical function below takes a list of names in a single string, with spaces separating the names. It returns True or False depending on whether the list is in alphabetical order. However, there's a bug. The code doesn't raise an exception, but it returns the wrong result, which makes the assertions fail. Find the bug and fix it so that the assertions pass.

    Hint 1: You can use the < operator to compare strings. For example, "a" < "b" is True, as is "ab" < "ac".

    Hint 2: Look at how each variable is used. You don't need to add or remove any lines of code to fix the bug. You only need to change some variable references (places where the function uses one variable, but it should use a different variable).

    def name_list_is_alphabetical(names):
        names_list = names.split(" ")

        idx = 0
        while idx < len(names_list) - 1:
            if names_list[idx] > names_list[idx + 1]:
                return False
            idx += 1
        return True

    assert name_list_is_alphabetical("")
    assert name_list_is_alphabetical("Gabriel")
    assert not name_list_is_alphabetical("Amir Cindy Betty")
    assert name_list_is_alphabetical("Betty Cindy Gabriel")
    assert name_list_is_alphabetical("Amir Cindy Hana")
    assert not name_list_is_alphabetical("Hana Amir Gabriel")
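  • The listing above refers to names_list in every comparison, so its assertions pass. As a hypothetical illustration of the kind of variable mix-up the problem describes (an assumption, not necessarily the bug the exercise started from), the sketch below indexes the string names instead of names_list. The function name buggy_name_list_is_alphabetical is made up for this example.

```python
def buggy_name_list_is_alphabetical(names):
    names_list = names.split(" ")

    idx = 0
    while idx < len(names_list) - 1:
        # Hypothetical bug: indexes the string `names`, so this compares
        # single characters rather than whole names.
        if names[idx] > names[idx + 1]:
            return False
        idx += 1
    return True

# "Amir Cindy Hana" is alphabetical, but the buggy version compares
# "A" with "m", then "m" with "i", and returns False.
print(buggy_name_list_is_alphabetical("Amir Cindy Hana"))  # False
```

    There is no exception here either: the only symptom is a wrong boolean, and only for some inputs.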