Python for Programmers: Lines in Files
Welcome to the Lines in Files lesson!
Files and StringIOs have a .readlines method. It reads until the end of the file (or until the end of the StringIO's buffer), then returns a list of strings, one for each line of text. Note that every line ends in a line break, "\n".
>
from io import StringIO
buffer = StringIO("one\ntwo\nthree\n")
buffer.readlines()
Result:
['one\n', 'two\n', 'three\n']
In an earlier lesson, we saw the .split method for strings. At first, StringIO.readlines seems conceptually similar to some_string.split("\n"), but they differ in two important ways.
First, .split doesn't include the separator in the resulting list. For example, when we call some_string.split("\n"), the "\n"s don't show up in the individual strings.
>
"abcd\nefgh".split("\n")
Result:
['abcd', 'efgh']
Second, when we .split a string with a trailing separator, it dutifully splits on that final separator. That gives us an empty string at the end of the list. (In the next example, the list has three elements. The last element is "".)
>
"abcd\nefgh\n".split("\n")
Result:
['abcd', 'efgh', '']
The .readlines method differs on both of these details. First, it includes the newline character "\n" at the end of each string. Second, even when the file (or StringIO) ends with a newline character, .readlines doesn't include an empty string at the end of the list.
>
from io import StringIO
buffer = StringIO("abcd\nefgh\n")
buffer.readlines()
Result:
['abcd\n', 'efgh\n']
When the file doesn't end in a newline, the last string from .readlines doesn't have a "\n" character either.
>
from io import StringIO
buffer = StringIO("abcd\nefgh")
buffer.readlines()
Result:
['abcd\n', 'efgh']
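To see both differences side by side, here is a small sketch that runs .split("\n") and .readlines() over the same text. The helper name compare is hypothetical, introduced only for this illustration.

```python
from io import StringIO


def compare(text):
    """Return (split_result, readlines_result) for the same text."""
    # compare is a hypothetical helper, not part of the lesson's own code.
    return text.split("\n"), StringIO(text).readlines()


# With a trailing newline: split adds a final "", readlines doesn't.
with_newline = compare("abcd\nefgh\n")
print(with_newline)

# Without a trailing newline: the last readlines string has no "\n".
without_newline = compare("abcd\nefgh")
print(without_newline)
```

In both cases the two lists hold the same text, but they disagree about where the "\n" characters go.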
These details may seem small, but they're very significant in practice. For example, if our code expects trailing "\n" characters but doesn't get them, it'll probably break. Similarly, if our code doesn't expect trailing "\n" characters but gets them, it'll also probably break.
.readlines modifies the stream position, like .read and .write do.
>
from io import StringIO
buffer = StringIO("abcd\nefgh\n")
buffer.seek(2)
buffer.readlines()
Result:
['cd\n', 'efgh\n']
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
# Remember that "\n" is a single character, not two characters!
buffer.tell()
Result:
10
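One consequence of the moving stream position is that a second .readlines call has nothing left to read. The following is a minimal sketch of that behavior, using seek(0) to rewind. The variable names first, second, and third are illustrative.

```python
from io import StringIO

buffer = StringIO("abcd\nefgh\n")

first = buffer.readlines()   # reads everything; the position is now at the end
second = buffer.readlines()  # nothing is left to read, so this list is empty

buffer.seek(0)               # rewind to the start of the buffer
third = buffer.readlines()   # reads all of the lines again

print(first, second, third)
```

If a list of lines unexpectedly comes back empty, an earlier read that already consumed the stream is a likely cause.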
There's one more thing to know about reading files and StringIOs. We often think of files as sequences of characters. We can .read a file (or StringIO), then iterate over the returned string to see one character at a time.
>
from io import StringIO
buffer = StringIO("cat")
chars = []
for char in buffer.read():
    chars.append(char)
chars
Result:
['c', 'a', 't']
In that code example, we explicitly called .read, then iterated over the string that we got back. However, if we write "for x in some_stringio" or "for x in some_file", we'll iterate over the file's lines, like we'd get by calling some_stringio.readlines().
>
from io import StringIO
buffer = StringIO("1\n5\n10\n12\n")
total = 0
for line in buffer:
    total += int(line)
total
Result:
28
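As a quick check of that claim, this sketch collects the lines produced by a for loop and compares them with what .readlines() returns for the same text. The variable names are illustrative.

```python
from io import StringIO

text = "1\n5\n10\n12\n"

# Iterating over the StringIO directly yields one line at a time.
looped = []
for line in StringIO(text):
    looped.append(line)

# A fresh StringIO over the same text, read all at once.
from_readlines = StringIO(text).readlines()

same = looped == from_readlines
print(looped, same)
```

Each line here still ends in "\n". The summing example above works anyway because int ignores surrounding whitespace, so int("5\n") is 5.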
The most important thing to remember about .readlines is its newline behavior. If a file contains "a\nb", then the lines are ["a\n", "b"]!
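One common way to cope with this, offered here as an assumption rather than something this lesson prescribes, is to strip the trailing newline from every line before using it. The string method rstrip("\n") removes trailing "\n" characters when they're present and leaves the string alone otherwise, so it handles both kinds of line.

```python
from io import StringIO

buffer = StringIO("a\nb")

raw_lines = buffer.readlines()

# Strip the trailing "\n" where present; the last line has none to strip.
lines = [line.rstrip("\n") for line in raw_lines]

print(raw_lines, lines)
```

After stripping, the rest of the code no longer has to care whether the file ended in a newline.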