Kodeclik Blog
How to reorder Pandas dataframe columns
Recall that a Pandas dataframe is like a table or spreadsheet: it is composed of rows and columns that hold data about some subject. Each column typically denotes one attribute (or variable), and each row contains one instance, or observation, of all the variables.
For instance, here is a simple Pandas dataframe where the rows denote months and columns denote various facets of months:
import pandas as pd
months = pd.DataFrame(
    {
        'name': ["Jan", "Feb", "Mar", "Apr", "May"],
        'number': [1, 2, 3, 4, 5],
        'days': [31, 28, 31, 30, 31]
    })
print(months)
The output will be:
  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
Now let us suppose you wish to reorder the columns so that the number of the month comes first, followed by the days, followed by the name of the month. Let us learn how to do so!
Method 1: Pass a list of the reordered columns to the dataframe to create a new dataframe
The easiest way to reorder columns in Pandas is to create a list of the column names in the desired order and use it to index the dataframe with square brackets, like so:
import pandas as pd
months = pd.DataFrame(
    {
        'name': ["Jan", "Feb", "Mar", "Apr", "May"],
        'number': [1, 2, 3, 4, 5],
        'days': [31, 28, 31, 30, 31]
    })
print(months)
neworder = ['number', 'days', 'name']
newmonths = months[neworder]
print(newmonths)
Here we have created neworder, which is a list of the column names in the desired order. Indexing months with this list makes Pandas look up each column by name and return a new dataframe whose columns follow that order; the original dataframe is left unchanged. The output is:
  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May
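To make the "new dataframe" point concrete, here is a minimal sketch that checks the column order before and after. It assumes the same months dataframe as above and prints only the column names rather than the full tables:

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

neworder = ['number', 'days', 'name']
newmonths = months[neworder]

# The original dataframe keeps its column order...
print(list(months.columns))     # ['name', 'number', 'days']
# ...while the new dataframe uses the requested order.
print(list(newmonths.columns))  # ['number', 'days', 'name']

# To keep only the reordered version, assign the result back to the same name.
months = months[neworder]
print(list(months.columns))     # ['number', 'days', 'name']
```

If you do not need the original order any more, reassigning to the same variable as in the last step is usually the simplest choice.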
Note that with this approach you can also list only a subset of the columns, which creates a dataframe that retains just the columns you name:
import pandas as pd
months = pd.DataFrame(
    {
        'name': ["Jan", "Feb", "Mar", "Apr", "May"],
        'number': [1, 2, 3, 4, 5],
        'days': [31, 28, 31, 30, 31]
    })
print(months)
neworder = ['number', 'name']
newmonths = months[neworder]
print(newmonths)
The output of the above code will be:
  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number name
0       1  Jan
1       2  Feb
2       3  Mar
3       4  Apr
4       5  May
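Because the new order is just a Python list, it can also be built programmatically instead of typed out by hand. The following is a small sketch, not the only way to do it, that moves one column to the front and keeps the rest in their existing order. The variable name first and the choice of 'days' are ours, purely for illustration:

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

# Hypothetical choice: the column we want to appear first.
first = 'days'

# Put that column first, then every other column in its current order.
neworder = [first] + [col for col in months.columns if col != first]
newmonths = months[neworder]

print(neworder)  # ['days', 'name', 'number']
print(newmonths)
```

This is handy when a dataframe has many columns and you only care about the position of one or two of them.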
Method 2: Use the iloc indexer
In the second approach, we use iloc, which selects by integer position. We pass it a list of integers representing the columns in the order we wish to place them in the new dataframe:
import pandas as pd
months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31]
})
print(months)
newmonths = months.iloc[:, [1, 2, 0]]
print(newmonths)
Note that iloc is used with square brackets rather than parentheses, and that it takes two indexers: one for rows and one for columns. The first corresponds to the set of rows to be retained (and possibly reordered). Here it is “:”, meaning all rows are retained in their original order. The second is the more interesting for our purposes. The list [1,2,0] states that we wish to place column 1 (the second column) first, followed by column 2 (the third column), followed by column 0 (the first column), or in other words, ‘number’ followed by ‘days’, followed by ‘name’. The output is:
  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May
as expected.
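Since iloc works on integer positions, it also accepts Python slices, and positions can be looked up from labels when you would rather not count columns by hand. Here is a brief sketch of both ideas, assuming the same months dataframe. The use of columns.get_loc is our own addition and assumes the column names are unique:

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

# A slice with step -1 reverses the column order; ":" keeps all rows.
reversed_months = months.iloc[:, ::-1]
print(list(reversed_months.columns))  # ['days', 'number', 'name']

# Look up the integer position of each label instead of counting columns.
positions = [months.columns.get_loc(label) for label in ['number', 'days', 'name']]
print(positions)  # [1, 2, 0]

newmonths = months.iloc[:, positions]
print(list(newmonths.columns))  # ['number', 'days', 'name']
```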
Method 3: Use the loc indexer
loc is quite similar to iloc, but instead of using integers to represent column positions, it uses the column labels. Like iloc, it is used with square brackets. Thus the above code is rewritten as:
import pandas as pd
months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31]
})
print(months)
newmonths = months.loc[:, ['number', 'days', 'name']]
print(newmonths)
The output is as before:
  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May
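One reason to reach for loc over plain list indexing is that the row part does not have to be “:”. As a rough sketch, assuming the same months dataframe, the following selects only the 31-day months and reorders the columns in a single step. The variable name long_months is ours:

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

# Row part: a boolean mask keeping months with 31 days.
# Column part: the labels in the order we want them.
long_months = months.loc[months['days'] == 31, ['number', 'days', 'name']]

print(list(long_months.columns))  # ['number', 'days', 'name']
print(list(long_months['name']))  # ['Jan', 'Mar', 'May']
```

The rows keep their original index labels (0, 2 and 4 here), so the result can still be matched back to the original dataframe.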
In conclusion, we have discussed three different methods for reordering columns in Pandas. Reordering columns is often part of cleaning and preparing data: it can make the data easier to analyze, improve readability, or put columns in the order a specific task requires. The methods shown here are useful for rearranging columns programmatically, e.g., when one function returns columns in one order and another function expects them in a different order.
Which of the methods we have discussed is your favorite?