How to Select Multiple Columns in Pandas

2021-01-11

Intro

In a previous article, we learned how to select a single column. In this article, we cover how to select multiple columns from a pandas DataFrame using the index operator, the loc indexer, and the iloc indexer. When given a list of columns, each of these returns a subset DataFrame rather than a Series.

Selecting Multiple Columns with the Index Operator

The first way to select multiple columns is with the index operator. This works like selecting a single column, except that we pass a list of column names instead of a single column name.

import pandas as pd

df = pd.DataFrame([
	{
		"person": "James",
		"sales": 1000,
	},
	{
		"person": "Clara",
		"sales": 3000,
	}
])

# Pass a list of column names; a misspelled name raises a KeyError
people = df[['person', 'sales']]
print(people)
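
To make the difference between one name and a list of names concrete, here is a small sketch using the same data. The variable names single, subset and reordered are only illustrative. It shows that a single label gives back a Series, a list of labels gives back a DataFrame, and the columns come back in the order they appear in the list.

```python
import pandas as pd

df = pd.DataFrame([
    {"person": "James", "sales": 1000},
    {"person": "Clara", "sales": 3000},
])

single = df["sales"]                 # one label -> Series
subset = df[["person", "sales"]]     # list of labels -> DataFrame
reordered = df[["sales", "person"]]  # columns follow the order of the list

print(type(single).__name__)    # Series
print(type(subset).__name__)    # DataFrame
print(list(reordered.columns))  # ['sales', 'person']
```

The outer brackets are the index operator and the inner brackets build a Python list, which is why the syntax uses double brackets.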

Selecting columns with loc

The next way to select columns is with the loc indexer. loc takes a row selector followed by a column selector, so it can select rows and columns at the same time.

import pandas as pd

df = pd.DataFrame([
	{
		"person": "James",
		"sales": 1000,
	},
	{
		"person": "Clara",
		"sales": 3000,
	}
])

# All rows (:), then the columns named in the list
newDf = df.loc[:, ['person', 'sales']]
print(newDf.head())

Notice here that we start with :, a slice with no start or end, which is the same slice syntax Python uses for lists. It tells loc to select all the rows; the list after the comma then selects the "person" and "sales" columns.
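
Since loc takes a row selector before the comma, the same call can filter rows while picking columns. The sketch below is one possible illustration on the same data; the 2000 threshold and the names top_sellers and by_slice are made up for the example. It also shows a label slice for the columns, which in loc includes both endpoints.

```python
import pandas as pd

df = pd.DataFrame([
    {"person": "James", "sales": 1000},
    {"person": "Clara", "sales": 3000},
])

# Boolean mask selects the rows, list of labels selects the columns
top_sellers = df.loc[df["sales"] > 2000, ["person", "sales"]]

# Label slice for columns: both "person" and "sales" are included
by_slice = df.loc[:, "person":"sales"]

print(top_sellers)
print(by_slice)
```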

Selecting columns with iloc

The final way to select columns is with the iloc indexer. Instead of labels, it takes integer positions, which start at 0. In the example below, we select all rows and the columns at positions 0 and 1, which are "person" and "sales".

import pandas as pd

df = pd.DataFrame([
	{
		"person": "James",
		"sales": 1000,
	},
	{
		"person": "Clara",
		"sales": 3000,
	}
])

# Positions are zero-based: 0 is "person", 1 is "sales"
newDf = df.iloc[:, [0, 1]]
print(newDf.head())
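
Because iloc works on zero-based positions, it also accepts position slices and negative positions. The following sketch, with illustrative variable names, shows both on the same data. Unlike label slices in loc, a position slice excludes its end, so 0:2 covers positions 0 and 1.

```python
import pandas as pd

df = pd.DataFrame([
    {"person": "James", "sales": 1000},
    {"person": "Clara", "sales": 3000},
])

# Position slice: start included, end excluded
both = df.iloc[:, 0:2]

# A one-element list still returns a DataFrame, not a Series
last = df.iloc[:, [-1]]

print(list(both.columns))   # ['person', 'sales']
print(list(last.columns))   # ['sales']
```

Asking for a position that does not exist in a list, such as df.iloc[:, [1, 2]] on this two-column DataFrame, raises an IndexError.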