Kodeclik Blog
How to extract the nth row in a Pandas dataframe
A Pandas dataframe is like a table or spreadsheet, composed of columns and rows that contain data about some subject. Each column typically denotes one attribute (or variable), and each row contains one instance or observation of all the variables.
For instance, here is a simple Pandas dataframe where the rows denote months and the columns denote attributes of each month:
import pandas as pd
months = pd.DataFrame(
    {
        'name': ["Jan", "Feb", "Mar", "Apr", "May"],
        'number': [1, 2, 3, 4, 5],
        'days': [31, 28, 31, 30, 31]
    })
print(months)
The output will be:
   name  number  days
0   Jan       1    31
1   Feb       2    28
2   Mar       3    31
3   Apr       4    30
4   May       5    31
Note that the months (rows) are numbered from 0 to 4.
If we wish to extract a specific row, we use the iloc indexer, which takes the integer position of the row inside square brackets and returns the full row. For instance, to extract the March row (the third row, at position 2), we write:
print(months.iloc[2])
The output is:
name      Mar
number      3
days       31
Name: 2, dtype: object
Note that the output is a Series. To get the result as a dataframe instead, we pass a list of integers rather than a single integer:
print(months.iloc[[2]])
The output is:
   name  number  days
2   Mar       3    31
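The difference between the two forms can be checked directly. The following is a minimal sketch that rebuilds the months dataframe from above; the variable names row_as_series and row_as_frame are only illustrative:

```python
import pandas as pd

months = pd.DataFrame({
    "name": ["Jan", "Feb", "Mar", "Apr", "May"],
    "number": [1, 2, 3, 4, 5],
    "days": [31, 28, 31, 30, 31],
})

# A single integer returns the row as a Series
row_as_series = months.iloc[2]
# A one-element list returns a dataframe with one row
row_as_frame = months.iloc[[2]]

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame
print(row_as_frame.shape)            # (1, 3)

# In the Series, the column names become the index
print(row_as_series["name"])         # Mar
```

The Series form is handy when you want to read individual values from the row, while the dataframe form keeps the tabular shape for further processing.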
We can extract multiple rows by providing the list of indices:
print(months.iloc[[2,3]])
The output is:
   name  number  days
2   Mar       3    31
3   Apr       4    30
That is, Jan, Feb, and May are skipped because the only indices given were 2 and 3 (Mar and Apr).
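The list does not have to be in ascending order, and negative integers count from the end, much like Python lists. Here is a short sketch of both ideas using the same months dataframe; the names picked and last are just for illustration:

```python
import pandas as pd

months = pd.DataFrame({
    "name": ["Jan", "Feb", "Mar", "Apr", "May"],
    "number": [1, 2, 3, 4, 5],
    "days": [31, 28, 31, 30, 31],
})

# Rows come back in the order the positions are listed
picked = months.iloc[[3, 0]]
print(picked["name"].tolist())  # ['Apr', 'Jan']

# -1 refers to the last row
last = months.iloc[[-1]]
print(last["name"].tolist())    # ['May']
```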
Another way to extract multiple rows is by using the “:” operator, like so:
print(months.iloc[1:4])
Here the notation means that we want the rows from index 1 (the second row) up to (but not including) the row with index 4. So we get the rows with indices 1, 2, and 3. The output is:
   name  number  days
1   Feb       2    28
2   Mar       3    31
3   Apr       4    30
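It is worth noting that iloc works with row positions rather than row labels. In our example the two happen to coincide because the dataframe uses the default 0, 1, 2, ... index. The sketch below assumes we re-index the same data by month name (the by_name variable is introduced only for this illustration) and shows that the same slice still picks the second through fourth rows:

```python
import pandas as pd

months = pd.DataFrame({
    "name": ["Jan", "Feb", "Mar", "Apr", "May"],
    "number": [1, 2, 3, 4, 5],
    "days": [31, 28, 31, 30, 31],
})

# Use the month names as row labels instead of 0..4
by_name = months.set_index("name")

# iloc still counts positions: rows 1, 2, and 3
middle = by_name.iloc[1:4]
print(middle.index.tolist())    # ['Feb', 'Mar', 'Apr']
print(middle["days"].tolist())  # [28, 31, 30]
```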
Similarly, if we did:
print(months.iloc[:4])
we will obtain:
   name  number  days
0   Jan       1    31
1   Feb       2    28
2   Mar       3    31
3   Apr       4    30
i.e., all rows up to (but not including) the row with index 4.
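Other slice forms familiar from Python lists also work with iloc. The following sketch runs a few of them on the same months dataframe and also illustrates one difference worth knowing: a slice that runs past the end is quietly clipped, whereas a single out-of-range integer raises an IndexError. The variable names are illustrative only:

```python
import pandas as pd

months = pd.DataFrame({
    "name": ["Jan", "Feb", "Mar", "Apr", "May"],
    "number": [1, 2, 3, 4, 5],
    "days": [31, 28, 31, 30, 31],
})

from_third = months.iloc[2:]    # third row to the end
last_two = months.iloc[-2:]     # final two rows
every_other = months.iloc[::2]  # every second row

print(from_third["name"].tolist())   # ['Mar', 'Apr', 'May']
print(last_two["name"].tolist())     # ['Apr', 'May']
print(every_other["name"].tolist())  # ['Jan', 'Mar', 'May']

# A slice past the end is clipped to the available rows
past_end = months.iloc[3:100]
print(past_end["name"].tolist())     # ['Apr', 'May']

# A single out-of-range position raises IndexError
raised = False
try:
    months.iloc[100]
except IndexError:
    raised = True
print(raised)                        # True
```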
We can also index rows and columns simultaneously with iloc by passing a row selection followed by a column selection. Here is an example of how that works:
print(months)
print(months.iloc[[0, 2], [1, 2]])
Here we are printing the full dataframe followed by the slice of the dataframe with rows numbered 0 and 2 and columns numbered 1 and 2. We obtain:
   name  number  days
0   Jan       1    31
1   Feb       2    28
2   Mar       3    31
3   Apr       4    30
4   May       5    31
   number  days
0       1    31
2       3    31
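The row and column selections can each be a single integer, a list, or a slice, and they can be mixed. Here is a brief sketch of a few combinations on the same months dataframe; cell, sub, and days_col are names chosen for this example:

```python
import pandas as pd

months = pd.DataFrame({
    "name": ["Jan", "Feb", "Mar", "Apr", "May"],
    "number": [1, 2, 3, 4, 5],
    "days": [31, 28, 31, 30, 31],
})

# Two integers return a single cell: row 2, column 0
cell = months.iloc[2, 0]
print(cell)                 # Mar

# Two slices return a smaller dataframe: rows 1-2, columns 0-1
sub = months.iloc[1:3, :2]
print(list(sub.columns))    # ['name', 'number']
print(sub.shape)            # (2, 2)

# ":" for the rows keeps every row; column 2 comes back as a Series
days_col = months.iloc[:, 2]
print(days_col.tolist())    # [31, 28, 31, 30, 31]
```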
In summary, extracting specific rows from a Pandas dataframe is easy with the iloc indexer. A single integer returns one row as a Series, a list of integers returns the selected rows as a dataframe, and a slice returns a contiguous range of rows. Adding a second selection after a comma picks out columns as well.