Execute Program

Python for Programmers: Lines in Files

Welcome to the Lines in Files lesson!

  • Files and StringIOs have a .readlines method. It reads until the end of the file (or until the end of the StringIO's buffer), then returns a list of strings, one for each line of text. Note that every line ends in a line break, "\n".

  • >
    from io import StringIO

    buffer = StringIO("one\ntwo\nthree\n")
    buffer.readlines()
    Result:
    ['one\n', 'two\n', 'three\n']
  • In an earlier lesson, we saw the .split method for strings. At first, StringIO.readlines seems conceptually similar to some_string.split("\n"), but they differ in two important ways.

  • First, .split doesn't include the separator in the resulting list. For example, when we call some_string.split("\n"), the "\n"s don't show up in the individual strings.

  • >
    "abcd\nefgh".split("\n")
    Result:
    ['abcd', 'efgh']
  • Second, when we .split a string with a trailing separator, it dutifully splits on that final separator. That gives us an empty string at the end of the list. (In the next example, the list has three elements. The last element is "".)

  • >
    "abcd\nefgh\n".split("\n")
    Result:
    ['abcd', 'efgh', '']
  • The .readlines method differs on both of these details. First, it includes the newline character "\n" at the end of each string. Second, even when the file (or StringIO) ends with a newline character, .readlines doesn't include an empty string at the end of the list.

  • >
    from io import StringIO

    buffer = StringIO("abcd\nefgh\n")
    buffer.readlines()
    Result:
    ['abcd\n', 'efgh\n']
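  • To see both differences side by side, here is a small sketch that runs .split and .readlines on the same text. The variable names (text, split_result, readlines_result) are made up for illustration.

```python
from io import StringIO

text = "abcd\nefgh\n"

# .split drops the separators and leaves an empty string after the final "\n".
split_result = text.split("\n")

# .readlines keeps each "\n" and adds no empty string at the end.
readlines_result = StringIO(text).readlines()

print(split_result)       # ['abcd', 'efgh', '']
print(readlines_result)   # ['abcd\n', 'efgh\n']
```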
  • When the file doesn't end in a newline, the last string from .readlines doesn't have a \n character either.

  • >
    from io import StringIO

    buffer = StringIO("abcd\nefgh")
    buffer.readlines()
    Result:
    ['abcd\n', 'efgh']
  • These details may seem small, but they matter in practice. Code that expects a trailing \n on every line can misbehave when the last line lacks one, and code that expects no trailing \n can misbehave when one is there. For example, the comparison line == "efgh" is False when line is "efgh\n".
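  • One common way to sidestep both problems is to strip the trailing newline from every line before using it. The following is a minimal sketch of that idea; the strip_newlines helper is hypothetical and not part of this lesson or of the io module.

```python
from io import StringIO

def strip_newlines(lines):
    # Each line from .readlines has at most one trailing "\n",
    # so rstrip("\n") removes it when present and does nothing otherwise.
    return [line.rstrip("\n") for line in lines]

with_final_newline = strip_newlines(StringIO("abcd\nefgh\n").readlines())
without_final_newline = strip_newlines(StringIO("abcd\nefgh").readlines())

print(with_final_newline)     # ['abcd', 'efgh']
print(without_final_newline)  # ['abcd', 'efgh']
```

    After stripping, both inputs produce the same list, so later code no longer depends on whether the text ended in a newline.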

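  • .readlines starts reading at the current stream position and, like .read and .write, moves that position as it goes. After the call, the position is at the end of the stream.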

  • >
    from io import StringIO

    buffer = StringIO("abcd\nefgh\n")
    buffer.seek(2)
    buffer.readlines()
    Result:
    ['cd\n', 'efgh\n']
  • Note: this code example reuses elements (variables, etc.) defined in earlier examples.
    >
    # Remember that "\n" is a single character, not two characters!
    buffer.tell()
    Result:
    10
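  • Because the position ends up at the end of the buffer, a second .readlines call has nothing left to read. This short sketch shows that consequence and uses .seek(0) to rewind; the names first, second, and third are only for illustration.

```python
from io import StringIO

buffer = StringIO("abcd\nefgh\n")

first = buffer.readlines()    # reads everything; position is now at the end
second = buffer.readlines()   # nothing left to read
buffer.seek(0)                # rewind to the start
third = buffer.readlines()    # reads everything again

print(first)   # ['abcd\n', 'efgh\n']
print(second)  # []
print(third)   # ['abcd\n', 'efgh\n']
```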
  • There's one more thing to know about reading files and StringIOs. We often think of files as sequences of characters. We can .read a file (or StringIO), then iterate over the returned string to see one character at a time.

  • >
    from io import StringIO

    buffer = StringIO("cat")
    chars = []
    for char in buffer.read():
        chars.append(char)
    chars
    Result:
    ['c', 'a', 't']
  • In that code example, we explicitly called .read, then iterated over the string that we got back. However, if we write for x in some_stringio or for x in some_file, we iterate over lines instead. Each x is one line, with the same newline behavior as the strings from some_stringio.readlines().

  • >
    from io import StringIO

    buffer = StringIO("1\n5\n10\n12\n")
    total = 0
    for line in buffer:
        total += int(line)
    total
    Result:
    28
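  • As a quick check, this sketch collects what a for loop over a buffer yields and compares it with what .readlines returns for the same text. Each line still carries its "\n"; int() ignores surrounding whitespace, which is why the sum above works without stripping. The names looped and same are only for illustration.

```python
from io import StringIO

text = "1\n5\n10\n12\n"

# Collect what a for loop over the buffer yields.
looped = []
for line in StringIO(text):
    looped.append(line)

# Compare against .readlines on a fresh buffer with the same contents.
same = looped == StringIO(text).readlines()

print(looped)  # ['1\n', '5\n', '10\n', '12\n']
print(same)    # True
```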
  • The most important thing to remember about .readlines is its newline behavior. If a file contains "a\nb", then the lines are ["a\n", "b"]!
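  • The examples in this lesson all use StringIO. As a sketch of the same rule on a real file, the code below writes "a\nb" to a temporary file and reads it back with .readlines. The file name example.txt and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Create a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")

    # Write two lines; the second has no trailing newline.
    with open(path, "w") as f:
        f.write("a\nb")

    # Read the file back as a list of lines.
    with open(path) as f:
        lines = f.readlines()

print(lines)  # ['a\n', 'b']
```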