How to Select Columns and Rows from a Data Frame in R

04.15.2021

When working with data frames in R, we have many options for selecting data. We can select columns and rows by position or by name, using a few different notations. In this article, we will learn how to select columns and rows from a data frame in R.

Selecting By Position

Selecting the nth column

We start by selecting a specific column. Similar to lists, we can use the double bracket [[]] operator to select a column. This will return a vector data type.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

df[[2]]

# [1] 2000 4000 3000

If we want to select a column and return a data frame, we can use the single bracket notation.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

df[2]

#   clicks
# 1   2000
# 2   4000
# 3   3000
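
To make the difference between the two return types concrete, the short sketch below checks the class of each result. It reuses the example data frame from above; the variable names as_vector and as_frame are only illustrative and are not part of the original example.

```r
ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Double brackets extract the column itself
as_vector = df[[2]]
class(as_vector)
# [1] "numeric"

## Single brackets keep the data frame wrapper
as_frame = df[2]
class(as_frame)
# [1] "data.frame"
```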

We can also pass a vector of positions to select multiple columns.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

df[c(1, 2)]

#       name clicks
# 1   Google   2000
# 2 Facebook   4000
# 3  Twitter   3000
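
The order of the positions in the vector also determines the order of the columns in the result. As a small sketch on the same data frame (the name reordered is just an illustrative variable), reversing the positions swaps the two columns:

```r
ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Select the second column first, then the first column
reordered = df[c(2, 1)]
reordered

#   clicks     name
# 1   2000   Google
# 2   4000 Facebook
# 3   3000  Twitter
```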

Using matrix-style indexing

Since a data frame is two-dimensional like a matrix, R also lets us use matrix-style [row, column] notation. This allows us to specify which rows we want to select as well as which columns. Let's see some examples.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Select all rows and first column
df[, 1]
# [1] "Google"   "Facebook" "Twitter"

## Select first two rows and first two columns, returns data frame
df[1:2, c(1, 2)]

#       name clicks
# 1   Google   2000
# 2 Facebook   4000
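
Note that in the first example above, selecting a single column with matrix notation gave back a vector rather than a data frame. If a data frame is wanted instead, one option is the drop argument of the bracket operator. The sketch below uses the same example data; the names simplified and kept are illustrative only.

```r
ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## A single column in matrix notation is simplified to a vector
simplified = df[, 1]
is.data.frame(simplified)
# [1] FALSE

## drop = FALSE keeps the result as a one-column data frame
kept = df[, 1, drop = FALSE]
is.data.frame(kept)
# [1] TRUE
```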

Selecting a Column by Name

A very useful feature is selecting columns by name. As with positions, we can use double brackets, single brackets, or a vector of column names. R also has the $ operator, which lets us select a column by name as if it were a property of the data frame.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Select the clicks column, returns vector
df[["clicks"]]
# [1] 2000 4000 3000

## Select the clicks column with $, returns vector
df$clicks
# [1] 2000 4000 3000

## Select the name column, returns data frame
df["name"]

#       name
# 1   Google
# 2 Facebook
# 3  Twitter

## Select multiple columns
df[c("name", "clicks")]
#       name clicks
# 1   Google   2000
# 2 Facebook   4000
# 3  Twitter   3000
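
One practical difference between the double bracket and the $ operator shows up when the column name is stored in a variable. The sketch below assumes a hypothetical variable called wanted that holds the name; double brackets evaluate it, while $ treats whatever follows it as a literal column name.

```r
ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Column name stored in a variable (hypothetical name)
wanted = "clicks"

## Double brackets use the value of the variable
df[[wanted]]
# [1] 2000 4000 3000

## $ looks for a column literally called "wanted", which does not exist
df$wanted
# NULL
```

This makes the double bracket form the safer choice inside functions or loops where the column name is not known in advance.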

Matrix-style selection by name

Just as with positions, we can also select columns by name using matrix-style notation.

ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Select first two rows and the name column
df[1:2, "name"]
# [1] "Google"   "Facebook"

## Select the first two rows and the name and clicks columns
df[1:2, c("name", "clicks")]
#       name clicks
# 1   Google   2000
# 2 Facebook   4000
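
Rows can be selected by name in the same way once the data frame has meaningful row names. The sketch below is an illustration only: it assumes we assign row names with rownames(), and the labels "g", "f" and "t" are made up for this example.

```r
ad.names = c("Google", "Facebook", "Twitter")
clicks = c(2000, 4000, 3000)

df = data.frame(name=ad.names, clicks)

## Assign row names (these labels are made up for the example)
rownames(df) = c("g", "f", "t")

## Select one row by name and all columns, returns data frame
df["f", ]
#       name clicks
# f Facebook   4000

## Select two rows by name and the clicks column, returns vector
df[c("g", "t"), "clicks"]
# [1] 2000 3000
```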