Kodeclik Blog

How to reorder Pandas dataframe columns

Recall that a Pandas dataframe is like a table or spreadsheet, composed of rows and columns that hold data about some subject. Each column typically denotes one attribute (or variable), and each row contains one instance or observation of all the variables.

For instance, here is a simple Pandas dataframe where the rows denote months and columns denote various facets of months:

import pandas as pd

months = pd.DataFrame(
  {
    'name': ["Jan", "Feb", "Mar", "Apr", "May"], 
    'number': [1, 2, 3, 4, 5], 
    'days': [31, 28, 31, 30, 31]
  })

print(months)

The output will be:

  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31

Now suppose you wish to reorder the columns so that the number of the month comes first, followed by the days, followed by the name of the month. Let us learn how to do so!

How to reorder columns in Pandas

Method 1: Index the dataframe with a list of the reordered columns to create a new dataframe

The easiest way to reorder columns in Pandas is to create a list of the column names in the desired order and use it to index the dataframe with square brackets, like so:

import pandas as pd

months = pd.DataFrame(
  {
    'name': ["Jan", "Feb", "Mar", "Apr", "May"], 
    'number': [1, 2, 3, 4, 5], 
    'days': [31, 28, 31, 30, 31]
  })

print(months)
                                        
neworder = ['number', 'days', 'name']
newmonths = months[neworder]
print(newmonths)

Here we have created neworder, a list of the column names in the desired order. Indexing months with this list returns a new dataframe whose columns appear in that order; the original months dataframe is left unchanged. The code prints the original dataframe first and then the reordered one, so the output is:

  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May
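The following short check is a sketch added for illustration, not part of the original walkthrough. It reuses the months and neworder names from above and confirms that indexing with a list builds a new dataframe while the original keeps its column order.

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

neworder = ['number', 'days', 'name']
newmonths = months[neworder]

# The new dataframe has its columns in the requested order...
print(list(newmonths.columns))  # ['number', 'days', 'name']

# ...while the original dataframe still has the original order.
print(list(months.columns))     # ['name', 'number', 'days']
```

If you want the name months to refer to the reordered dataframe, you can assign the result back, as in months = months[neworder].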

Note that the list does not have to name every column. If you list only some of the columns, the new dataframe retains just those columns, in the order given:

import pandas as pd

months = pd.DataFrame(
  {
    'name': ["Jan", "Feb", "Mar", "Apr", "May"], 
    'number': [1, 2, 3, 4, 5], 
    'days': [31, 28, 31, 30, 31]
  })

print(months)
                                        
neworder = ['number', 'name']
newmonths = months[neworder]
print(newmonths)

The output of the above code will be:

  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number name
0       1  Jan
1       2  Feb
2       3  Mar
3       4  Apr
4       5  May
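One detail that is easy to trip over when selecting a subset: indexing with a list of labels gives back a dataframe, whereas indexing with a single bare label gives back a Series. The sketch below illustrates the difference; the variable names subset, single and one_column are made up for this example.

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

# A list of labels (note the double brackets) returns a DataFrame.
subset = months[['number', 'name']]
print(type(subset).__name__)      # DataFrame

# A single label without a list returns a Series.
single = months['number']
print(type(single).__name__)      # Series

# A one-element list still returns a DataFrame with one column.
one_column = months[['number']]
print(type(one_column).__name__)  # DataFrame
```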

Method 2: Use the iloc indexer

In the second approach, we use iloc with a list of integers representing the positions of the columns, in the order we wish them to appear in the new dataframe:

import pandas as pd

months = pd.DataFrame({
  'name': ["Jan", "Feb", "Mar", "Apr", "May"],
  'number': [1, 2, 3, 4, 5],
  'days': [31, 28, 31, 30, 31]
})

print(months)

newmonths = months.iloc[:,[1,2,0]]
print(newmonths)

Note that iloc is an indexer used with square brackets, not a method called with parentheses. It takes two selectors separated by a comma. The first selects the rows to be retained (and possibly reordered); here it is ":", meaning all rows are retained in their existing order. The second is the more interesting one for our purposes. The list [1,2,0] states that we wish to place column 1 (the second column) first, followed by column 2 (the third column), followed by column 0 (the first column), or in other words, 'number' followed by 'days', followed by 'name'. The output is:

  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May

as expected.
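A few variations on the same idea, offered as a sketch: the column selector can also be a slice, the row selector can be a list, and the integer positions can be looked up from the labels with the Index.get_loc method instead of being typed by hand. The names reversed_cols, some_rows and positions are chosen just for this illustration.

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

# A slice with step -1 reverses the column order.
reversed_cols = months.iloc[:, ::-1]
print(list(reversed_cols.columns))  # ['days', 'number', 'name']

# The row selector can pick rows at the same time:
# rows 0 and 2 only, with the columns reordered.
some_rows = months.iloc[[0, 2], [1, 2, 0]]
print(some_rows)

# Look up the positions from the labels rather than hard-coding them.
positions = [months.columns.get_loc(c) for c in ['number', 'days', 'name']]
newmonths = months.iloc[:, positions]
print(list(newmonths.columns))      # ['number', 'days', 'name']
```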

Method 3: Use the loc indexer

The loc indexer is quite similar to iloc, but instead of using integer positions to select the columns, it uses the column labels. Thus the above code can be rewritten as:

import pandas as pd

months = pd.DataFrame({
  'name': ["Jan", "Feb", "Mar", "Apr", "May"],
  'number': [1, 2, 3, 4, 5],
  'days': [31, 28, 31, 30, 31]
})

print(months)

newmonths = months.loc[:,['number','days','name']]
print(newmonths)

The output is as before:

  name  number  days
0  Jan       1    31
1  Feb       2    28
2  Mar       3    31
3  Apr       4    30
4  May       5    31
   number  days name
0       1    31  Jan
1       2    28  Feb
2       3    31  Mar
3       4    30  Apr
4       5    31  May
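To confirm that the three approaches really do produce the same result on this example, one can compare them with DataFrame.equals. This is only a sanity check added here for illustration; the names by_list, by_iloc and by_loc are not from the original code.

```python
import pandas as pd

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

order = ['number', 'days', 'name']

by_list = months[order]               # Method 1: index with a list of labels
by_iloc = months.iloc[:, [1, 2, 0]]   # Method 2: integer positions
by_loc = months.loc[:, order]         # Method 3: labels via loc

# equals() compares shape, labels and values.
print(by_list.equals(by_iloc))  # True
print(by_list.equals(by_loc))   # True
```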

In conclusion, we have discussed three different methods for reordering columns in Pandas. Reordering columns can be an important part of cleaning and preparing data: it can make a dataframe easier to analyze and easier to read. The methods shown here are also useful for rearranging columns programmatically, e.g., when one function returns a dataframe with its columns in one order and another function expects them in a different order.
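As a rough sketch of that last use case, the helper below rearranges a dataframe into the column order some other function expects. The function name with_column_order and the explicit check for missing columns are assumptions made for this illustration; they are not part of Pandas.

```python
import pandas as pd

def with_column_order(df, expected):
    """Return a new dataframe with columns in the expected order.

    Hypothetical helper: raises KeyError if a requested column is absent.
    """
    missing = [c for c in expected if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    # Label-based selection, as in Method 3.
    return df.loc[:, expected]

months = pd.DataFrame({
    'name': ["Jan", "Feb", "Mar", "Apr", "May"],
    'number': [1, 2, 3, 4, 5],
    'days': [31, 28, 31, 30, 31],
})

reordered = with_column_order(months, ['number', 'days', 'name'])
print(list(reordered.columns))  # ['number', 'days', 'name']
```

Checking for missing columns up front gives a clearer error message than letting the selection fail on its own, though whether that is worth the extra lines depends on your code.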

Which of the methods we have discussed is your favorite?

If you liked this blog post, check out our post on extracting the nth row of a Pandas dataframe.

Interested in more things Python? Check out our post on Python queues and our blog post on Python's enumerate() function. If you like Python+math content, see our blog post on Magic Squares. Finally, master the Python print function!

Want to learn Python with us? Sign up for 1:1 or small group classes.


Copyright @ Kodeclik 2024. All rights reserved.