Kodeclik Blog
How to split a list into sublists of given lengths
You might sometimes find a need to split a Python list into sublists of predefined lengths. Consider the following list for instance:
list_of_states = ['New','Jersey','Virginia','New','York','California']
Note that the names of some states have been broken down into individual words and we wish to bring them back together and group them accordingly. In other words, we desire to morph this list into:
[['New', 'Jersey'], ['Virginia'], ['New', 'York'], ['California']]
There are two ways to do so. Let's explore both of them next!
Method 1: Use the islice() function from the itertools module
The itertools module has an islice() function that selects a portion of an iterator, and we will use it in our program. First, we create an iterator for the given list. Then, for each requested size, we pass this iterator and the size to islice() to obtain the next sublist. Finally, we turn each result into a list using the list constructor and collect the results in an outer list. Here is how that works:
from itertools import islice
list_of_states = ['New','Jersey','Virginia','New','York','California']
split_sizes = [2,1,2,1]
x = iter(list_of_states)
y = [list(islice(x,s)) for s in split_sizes]
print(y)
Note that we import islice from itertools first. The split_sizes list gives the sizes of the sublists we need to construct (we assume that the user has determined these correctly, so that they add up to the length of the list). The actual splitting happens in the variable “y”, which uses a list comprehension to build a list of lists. Because every islice() call reads from the same iterator x, each call picks up where the previous one left off. The output is:
[['New', 'Jersey'], ['Virginia'], ['New', 'York'], ['California']]
as desired.
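To see why a single iterator is enough, it may help to step through it by hand. The sketch below assumes the same list as above; the names it, first, second, and leftover are introduced here only for illustration.

```python
from itertools import islice

list_of_states = ['New', 'Jersey', 'Virginia', 'New', 'York', 'California']

# One shared iterator: every islice() call consumes items from it.
it = iter(list_of_states)

first = list(islice(it, 2))   # takes the first two items
second = list(islice(it, 1))  # resumes at the third item
leftover = list(it)           # whatever has not been consumed yet

print(first)     # ['New', 'Jersey']
print(second)    # ['Virginia']
print(leftover)  # ['New', 'York', 'California']
```

One consequence: if the sizes add up to less than the length of the list, the remaining items are silently left out, and if they add up to more, the last sublists come back short or empty. Checking that sum(split_sizes) equals len(list_of_states) beforehand is a cheap safeguard.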
Method 2: Use the accumulate() and zip() functions to break down the list into parts
In our second approach we still use the itertools module, but a different function (accumulate), together with the built-in zip function. Here is how that works:
from itertools import accumulate
list_of_states = ['New','Jersey','Virginia','New','York','California']
split_sizes = [2,1,2,1]
# x is the running total (end index) and y is the size of each sublist
result = [list_of_states[(x-y): x]
          for x, y in zip(accumulate(split_sizes), split_sizes)]
print(result)
The output will be, as before:
[['New', 'Jersey'], ['Virginia'], ['New', 'York'], ['California']]
Now, why and how does this work? Let us unpack this program a bit. We first need to understand what accumulate() does. Let us add some print statements to get the idea:
print(split_sizes)
print(list(accumulate(split_sizes)))
The output for the above two lines would be:
[2, 1, 2, 1]
[2, 3, 5, 6]
Note how accumulate() has created a running total of the sizes, so that the final value in the accumulated list is the size of the original list, i.e., 6. Each running total is therefore the position at which one sublist ends, and we can use this information to find the starting points of the resulting sublists. We first use the zip function to pair each running total with its size. If we were to do:
print(list(zip(accumulate(split_sizes), split_sizes)))
this will output:
[(2, 2), (3, 1), (5, 2), (6, 1)]
Each pair gives the ending point of a sublist together with its size. This is why we slice the list with [(x-y): x]: x-y is the starting index, and x is one more than the index of the last element, because Python slices exclude the end index. So when we put it all together, we obtain our desired output:
[['New', 'Jersey'], ['Virginia'], ['New', 'York'], ['California']]
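If you want to reuse this idea, one possible way to package it is as a small function. This is only a sketch: the name split_by_sizes and the loop variables end and size are hypothetical, and the check that the sizes add up to the length of the list is an added assumption, not something the program above does.

```python
from itertools import accumulate

def split_by_sizes(items, sizes):
    """Split items into consecutive sublists whose lengths are given by sizes."""
    # Assumption: the sizes must account for every item exactly once.
    if sum(sizes) != len(items):
        raise ValueError("sizes must add up to the length of the list")
    # Each running total is the (exclusive) end index of one sublist;
    # subtracting that sublist's size gives its start index.
    return [items[end - size:end]
            for end, size in zip(accumulate(sizes), sizes)]

list_of_states = ['New', 'Jersey', 'Virginia', 'New', 'York', 'California']
print(split_by_sizes(list_of_states, [2, 1, 2, 1]))
# [['New', 'Jersey'], ['Virginia'], ['New', 'York'], ['California']]
```

Raising an error on mismatched sizes is a design choice; you could instead decide to return the leftover items as a final sublist.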
As you can see, we have learnt two different ways to split a list into sublists of given lengths in Python. Which one is your favorite?