Kodeclik Blog

Splitting a Text File by Lines in Python

In this blog post we show you how to split a text file into lines in your Python program. Suppose you have a text file containing months and seasons, like so:

January Winter
February Winter
March Spring
April Spring
May Spring
June Summer
July Summer
August Summer
September Fall
October Fall
November Fall
December Winter

Let us name this file “seasons.txt”. Now we show you how to read this text file and split it into lines inside your Python program for ease of processing.

Python Split Files into Lines

Method 1: Use a for loop to read each line

The easiest way is to open the file using the open() function and loop over the resulting file object with a for loop. The file object hands you one line per iteration, so you can print each line, one at a time.

Consider the program:

with open("seasons.txt", 'r') as seasonslist:
  for x in seasonslist:
    print(x)

This produces the output:

January Winter

February Winter

March Spring

April Spring

May Spring

June Summer

July Summer

August Summer

September Fall

October Fall

November Fall

December Winter

Note that there is an extra newline after each line. This is because, in addition to the newline in the file after each line, the print() function adds its own newline after each invocation. Within the for loop you can do any special processing you desire on the variable.

For instance, you can remove the newlines with the following program:

with open("seasons.txt", 'r') as seasonslist:
  for x in seasonslist:
    # strip() removes the trailing newline (and any surrounding whitespace)
    print(x.strip())

The output is:

January Winter
February Winter
March Spring
April Spring
May Spring
June Summer
July Summer
August Summer
September Fall
October Fall
November Fall
December Winter
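Keep in mind that strip() removes leading whitespace as well as the trailing newline. If your lines are indented and you want to keep that indentation, one option is rstrip("\n"), which drops only the line break. Here is a minimal sketch of that idea. The helper name read_lines_keep_indent and the small sample file are made up for illustration and are not part of the program above.

```python
import os
import tempfile

def read_lines_keep_indent(path):
    """Return the lines of a file with only the trailing newline removed."""
    result = []
    with open(path, 'r') as f:
        for x in f:
            # rstrip("\n") drops the line break but keeps leading spaces
            result.append(x.rstrip("\n"))
    return result

# Build a tiny sample file (hypothetical data) so the sketch runs on its own
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, 'w') as f:
    f.write("January Winter\n  February Winter\n")

lines = read_lines_keep_indent(sample_path)
for x in lines:
    print(x)
```

The second line keeps its two leading spaces, which strip() would have removed.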

Method 2: Use a list comprehension

A second way to split a text file line by line is using list comprehension syntax. Consider the program:

with open("seasons.txt",'r') as seasonslist:
  all_lines = [s.strip() for s in seasonslist]

  print(all_lines)

This will output:

['January Winter', 'February Winter', 'March Spring', 'April Spring', 
'May Spring', 'June Summer', 'July Summer', 'August Summer', 
'September Fall', 'October Fall', 'November Fall', 'December Winter']

If you desire pretty printing, you can use another loop, like so:

with open("seasons.txt",'r') as seasonslist:
  all_lines = [s.strip() for s in seasonslist]

for x in all_lines:
  print(x)

This will output as before:

January Winter
February Winter
March Spring
April Spring
May Spring
June Summer
July Summer
August Summer
September Fall
October Fall
November Fall
December Winter
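A list comprehension can also filter lines while it splits them. As a sketch, suppose the file might contain blank lines that we want to skip. The helper name non_blank_lines and the sample data below are assumptions made for this example.

```python
import os
import tempfile

def non_blank_lines(path):
    """Return stripped lines, leaving out lines that are empty or only whitespace."""
    with open(path, 'r') as f:
        # The trailing "if s.strip()" keeps a line only when something
        # is left after removing surrounding whitespace
        return [s.strip() for s in f if s.strip()]

# Build a tiny sample file (hypothetical data) with two blank-looking lines
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, 'w') as f:
    f.write("March Spring\n\nApril Spring\n   \nMay Spring\n")

cleaned = non_blank_lines(sample_path)
print(cleaned)
```

The empty line and the whitespace-only line are both dropped from the resulting list.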

Method 3: Use split() or splitlines()

The split() method splits a string into a list of fields. Thus the third method for splitting our text file looks like this:

# Using "with" ensures the file is closed when the loop finishes
with open("seasons.txt", 'r') as seasonslist:
  for x in seasonslist:
    print(x.split())

We read each line as before into the variable “x” and then use split to convert each line into a list. The output is:

['January', 'Winter']
['February', 'Winter']
['March', 'Spring']
['April', 'Spring']
['May', 'Spring']
['June', 'Summer']
['July', 'Summer']
['August', 'Summer']
['September', 'Fall']
['October', 'Fall']
['November', 'Fall']
['December', 'Winter']

When called with no argument, split() splits each line on any run of whitespace. It also takes an optional argument naming the separator to split on. If instead of whitespace the fields are separated by a pipe (“|”) character then this should be given as an argument to the split() method.
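To make the separator argument concrete, here is a small sketch using pipe-separated lines. The helper name split_fields and the sample lines are hypothetical; only str.split() and str.strip() are standard Python.

```python
def split_fields(line, sep="|"):
    """Split one line on sep and tidy up whitespace around each field."""
    # strip() first removes the trailing newline from the line,
    # then each field is stripped of any spaces around the separator
    return [field.strip() for field in line.strip().split(sep)]

# Hypothetical lines, as they might be read from a pipe-separated file
sample = ["January|Winter\n", "March | Spring\n"]
rows = [split_fields(x) for x in sample]
print(rows)

# With no argument, runs of whitespace count as a single separator;
# with an explicit " " separator, every single space splits
print("a  b".split())     # ['a', 'b']
print("a  b".split(" "))  # ['a', '', 'b']
```

Note how an explicit separator produces empty fields when two separators sit side by side, which the no-argument form never does.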

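The splitlines() method splits a string at line boundaries rather than at whitespace. It behaves for the most part like split("\n") except in edge cases involving empty strings and terminal line breaks. Its arguments are also different: it takes an optional keepends flag instead of a separator. Because each x here holds a single line, splitlines() returns a one-element list with the line break removed. Thus the code: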

# Using "with" ensures the file is closed when the loop finishes
with open("seasons.txt", 'r') as seasonslist:
  for x in seasonslist:
    print(x.splitlines())

outputs:

['January Winter']
['February Winter']
['March Spring']
['April Spring']
['May Spring']
['June Summer']
['July Summer']
['August Summer']
['September Fall']
['October Fall']
['November Fall']
['December Winter']
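Since splitlines() works on line boundaries, it is arguably more natural to apply it once to the whole file contents than to each line separately. The sketch below reads the file in one go with read() and then splits it. The helper name file_to_lines and the sample file are assumptions for illustration, and because read() loads everything into memory this suits small files best.

```python
import os
import tempfile

def file_to_lines(path):
    """Read the entire file into one string and split it at line boundaries."""
    with open(path, 'r') as f:
        return f.read().splitlines()

# Build a tiny sample file (hypothetical data) so the sketch runs on its own
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, 'w') as f:
    f.write("June Summer\nJuly Summer\nAugust Summer\n")

summer = file_to_lines(sample_path)
print(summer)

# The terminal line break edge case: split("\n") leaves a trailing
# empty string, while splitlines() does not
print("a\nb\n".split("\n"))   # ['a', 'b', '']
print("a\nb\n".splitlines())  # ['a', 'b']
```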

Method 4: Use a generator

The final method we will encounter is considered a “lazy” method and is suitable for reading very large files. It splits a text file line by line but doesn’t do it all in one stroke. Instead it provides lines one by one on demand.

First we write a function called “lazy_read”:

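def lazy_read(file):
  # "with" closes the file once the generator is exhausted
  with open(file, 'r') as fp:
    while True:
      line = fp.readline()
      # readline() returns an empty string at the end of the file
      if not line:
        break
      yield line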

Note that in this function we use the “yield” statement instead of return. The yield statement suspends the function’s execution (in this case, after handing back the first line read), but when the next line is requested the function remembers enough state to continue where it left off (on to the second line, and so on). In this manner, you never hold the whole file in memory and can process even large files line by line.

Now here is how we use this function:

seasonslist = lazy_read('seasons.txt')
for x in seasonslist:
  print(x.strip())

The output is as before:

January Winter
February Winter
March Spring
April Spring
May Spring
June Summer
July Summer
August Summer
September Fall
October Fall
November Fall
December Winter
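To see the on-demand behavior more directly, you can ask the generator for lines one at a time with Python's built-in next() function instead of a for loop. The sketch below repeats a version of lazy_read so that it runs on its own. The sample file and the variable names first, second, and remaining are made up for this example.

```python
import os
import tempfile

def lazy_read(file):
    """Yield the lines of a file one at a time, on demand."""
    with open(file, 'r') as fp:
        while True:
            line = fp.readline()
            # readline() returns an empty string at the end of the file
            if not line:
                break
            yield line

# Build a tiny sample file (hypothetical data) so the sketch runs on its own
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, 'w') as f:
    f.write("September Fall\nOctober Fall\nNovember Fall\n")

seasonslist = lazy_read(sample_path)  # nothing has been read yet
first = next(seasonslist)             # reads only the first line
second = next(seasonslist)            # resumes and reads the second line
remaining = list(seasonslist)         # drains whatever is left

print(first.strip())
print(second.strip())
print(remaining)
```

Each call to next() resumes the function just after its yield statement, which is why the second call returns the second line rather than starting over.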

You have learnt four different ways to read a text file and split it into lines in Python. Which one is your favorite?

Interested in more things Python? See our blog post on Python's enumerate() capability. Also if you like Python+math content, see our blog post on Magic Squares. Finally, master the Python print function!

Want to learn Python with us? Sign up for 1:1 or small group classes.

Copyright @ Kodeclik 2024. All rights reserved.