How to use dplyr select in R

06.11.2021

Intro

The select method let’s you easily select columns from your data set. There are many helpful operators and select helpers to get what you need. In this article, we will learn how to use the dplyr select method.

If you are in a hurry

If you don’t have time to read, here is a quick code snippet for you.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.0     v dplyr   1.0.5
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
mtcars %>% select(mpg, cyl, hp)
##                      mpg cyl  hp
## Mazda RX4           21.0   6 110
## Mazda RX4 Wag       21.0   6 110
## Datsun 710          22.8   4  93
## Hornet 4 Drive      21.4   6 110
## Hornet Sportabout   18.7   8 175
## Valiant             18.1   6 105
## Duster 360          14.3   8 245
## Merc 240D           24.4   4  62
## Merc 230            22.8   4  95
## Merc 280            19.2   6 123
## Merc 280C           17.8   6 123
## Merc 450SE          16.4   8 180
## Merc 450SL          17.3   8 180
## Merc 450SLC         15.2   8 180
## Cadillac Fleetwood  10.4   8 205
## Lincoln Continental 10.4   8 215
## Chrysler Imperial   14.7   8 230
## Fiat 128            32.4   4  66
## Honda Civic         30.4   4  52
## Toyota Corolla      33.9   4  65
## Toyota Corona       21.5   4  97
## Dodge Challenger    15.5   8 150
## AMC Javelin         15.2   8 150
## Camaro Z28          13.3   8 245
## Pontiac Firebird    19.2   8 175
## Fiat X1-9           27.3   4  66
## Porsche 914-2       26.0   4  91
## Lotus Europa        30.4   4 113
## Ford Pantera L      15.8   8 264
## Ferrari Dino        19.7   6 175
## Maserati Bora       15.0   8 335
## Volvo 142E          21.4   4 109

Loading the Library

We can load the dplyr package directly, but I recommend loading the tidyverse package as we will use some other features in side.

library(tidyverse)

Loading the Dataset

For this tutorial, we will use the mtcars data set the comes with tidyverse. We take a look at this data set below.

data(mtcars)

glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,~
## $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,~
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16~
## $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180~
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,~
## $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.~
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18~
## $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,~
## $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,~
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,~
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,~

Basic dplyr Select

To use the select method, we can pass our data set and the column we would like to select as parameters. This will return a new data frame with just that column.

select(mtcars, mpg)
##                      mpg
## Mazda RX4           21.0
## Mazda RX4 Wag       21.0
## Datsun 710          22.8
## Hornet 4 Drive      21.4
## Hornet Sportabout   18.7
## Valiant             18.1
## Duster 360          14.3
## Merc 240D           24.4
## Merc 230            22.8
## Merc 280            19.2
## Merc 280C           17.8
## Merc 450SE          16.4
## Merc 450SL          17.3
## Merc 450SLC         15.2
## Cadillac Fleetwood  10.4
## Lincoln Continental 10.4
## Chrysler Imperial   14.7
## Fiat 128            32.4
## Honda Civic         30.4
## Toyota Corolla      33.9
## Toyota Corona       21.5
## Dodge Challenger    15.5
## AMC Javelin         15.2
## Camaro Z28          13.3
## Pontiac Firebird    19.2
## Fiat X1-9           27.3
## Porsche 914-2       26.0
## Lotus Europa        30.4
## Ford Pantera L      15.8
## Ferrari Dino        19.7
## Maserati Bora       15.0
## Volvo 142E          21.4

When working with dplyr and the tidyverse, we often use the pipe, %>% operator. With this, we can send the data set to our method to use. Here is a rewrite of the code above.

mtcars %>% select(mpg)
##                      mpg
## Mazda RX4           21.0
## Mazda RX4 Wag       21.0
## Datsun 710          22.8
## Hornet 4 Drive      21.4
## Hornet Sportabout   18.7
## Valiant             18.1
## Duster 360          14.3
## Merc 240D           24.4
## Merc 230            22.8
## Merc 280            19.2
## Merc 280C           17.8
## Merc 450SE          16.4
## Merc 450SL          17.3
## Merc 450SLC         15.2
## Cadillac Fleetwood  10.4
## Lincoln Continental 10.4
## Chrysler Imperial   14.7
## Fiat 128            32.4
## Honda Civic         30.4
## Toyota Corolla      33.9
## Toyota Corona       21.5
## Dodge Challenger    15.5
## AMC Javelin         15.2
## Camaro Z28          13.3
## Pontiac Firebird    19.2
## Fiat X1-9           27.3
## Porsche 914-2       26.0
## Lotus Europa        30.4
## Ford Pantera L      15.8
## Ferrari Dino        19.7
## Maserati Bora       15.0
## Volvo 142E          21.4

Multiple Select

To select multiple columns, we can pas the column names as parameters and a new data frame will return with all the columns.

mtcars %>% select(mpg, hp, wt)
##                      mpg  hp    wt
## Mazda RX4           21.0 110 2.620
## Mazda RX4 Wag       21.0 110 2.875
## Datsun 710          22.8  93 2.320
## Hornet 4 Drive      21.4 110 3.215
## Hornet Sportabout   18.7 175 3.440
## Valiant             18.1 105 3.460
## Duster 360          14.3 245 3.570
## Merc 240D           24.4  62 3.190
## Merc 230            22.8  95 3.150
## Merc 280            19.2 123 3.440
## Merc 280C           17.8 123 3.440
## Merc 450SE          16.4 180 4.070
## Merc 450SL          17.3 180 3.730
## Merc 450SLC         15.2 180 3.780
## Cadillac Fleetwood  10.4 205 5.250
## Lincoln Continental 10.4 215 5.424
## Chrysler Imperial   14.7 230 5.345
## Fiat 128            32.4  66 2.200
## Honda Civic         30.4  52 1.615
## Toyota Corolla      33.9  65 1.835
## Toyota Corona       21.5  97 2.465
## Dodge Challenger    15.5 150 3.520
## AMC Javelin         15.2 150 3.435
## Camaro Z28          13.3 245 3.840
## Pontiac Firebird    19.2 175 3.845
## Fiat X1-9           27.3  66 1.935
## Porsche 914-2       26.0  91 2.140
## Lotus Europa        30.4 113 1.513
## Ford Pantera L      15.8 264 3.170
## Ferrari Dino        19.7 175 2.770
## Maserati Bora       15.0 335 3.570
## Volvo 142E          21.4 109 2.780

Select Operators

The select method can use a few operators to help us select more columns. For example, we can use the : operator to select a range of columns. In this example, select all columns between mpg and wt.

mtcars %>% select(mpg:wt)
##                      mpg cyl  disp  hp drat    wt
## Mazda RX4           21.0   6 160.0 110 3.90 2.620
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875
## Datsun 710          22.8   4 108.0  93 3.85 2.320
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440
## Valiant             18.1   6 225.0 105 2.76 3.460
## Duster 360          14.3   8 360.0 245 3.21 3.570
## Merc 240D           24.4   4 146.7  62 3.69 3.190
## Merc 230            22.8   4 140.8  95 3.92 3.150
## Merc 280            19.2   6 167.6 123 3.92 3.440
## Merc 280C           17.8   6 167.6 123 3.92 3.440
## Merc 450SE          16.4   8 275.8 180 3.07 4.070
## Merc 450SL          17.3   8 275.8 180 3.07 3.730
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345
## Fiat 128            32.4   4  78.7  66 4.08 2.200
## Honda Civic         30.4   4  75.7  52 4.93 1.615
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835
## Toyota Corona       21.5   4 120.1  97 3.70 2.465
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520
## AMC Javelin         15.2   8 304.0 150 3.15 3.435
## Camaro Z28          13.3   8 350.0 245 3.73 3.840
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140
## Lotus Europa        30.4   4  95.1 113 3.77 1.513
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770
## Maserati Bora       15.0   8 301.0 335 3.54 3.570
## Volvo 142E          21.4   4 121.0 109 4.11 2.780

We can do the reverse using the ! to select all columns that are not in the range of mpg to wt.

mtcars %>% select(!(mpg:wt))
##                      qsec vs am gear carb
## Mazda RX4           16.46  0  1    4    4
## Mazda RX4 Wag       17.02  0  1    4    4
## Datsun 710          18.61  1  1    4    1
## Hornet 4 Drive      19.44  1  0    3    1
## Hornet Sportabout   17.02  0  0    3    2
## Valiant             20.22  1  0    3    1
## Duster 360          15.84  0  0    3    4
## Merc 240D           20.00  1  0    4    2
## Merc 230            22.90  1  0    4    2
## Merc 280            18.30  1  0    4    4
## Merc 280C           18.90  1  0    4    4
## Merc 450SE          17.40  0  0    3    3
## Merc 450SL          17.60  0  0    3    3
## Merc 450SLC         18.00  0  0    3    3
## Cadillac Fleetwood  17.98  0  0    3    4
## Lincoln Continental 17.82  0  0    3    4
## Chrysler Imperial   17.42  0  0    3    4
## Fiat 128            19.47  1  1    4    1
## Honda Civic         18.52  1  1    4    2
## Toyota Corolla      19.90  1  1    4    1
## Toyota Corona       20.01  1  0    3    1
## Dodge Challenger    16.87  0  0    3    2
## AMC Javelin         17.30  0  0    3    2
## Camaro Z28          15.41  0  0    3    4
## Pontiac Firebird    17.05  0  0    3    2
## Fiat X1-9           18.90  1  1    4    1
## Porsche 914-2       16.70  0  1    5    2
## Lotus Europa        16.90  1  1    5    2
## Ford Pantera L      14.50  0  1    5    4
## Ferrari Dino        15.50  0  1    5    6
## Maserati Bora       14.60  0  1    5    8
## Volvo 142E          18.60  1  1    4    2

Select Helpers

We can use select helpers, https://tidyselect.r-lib.org/reference/select_helpers.html, to query for columns. In this example, we select all columns that start with the letter “c”.

mtcars %>% select(starts_with("c"))
##                     cyl carb
## Mazda RX4             6    4
## Mazda RX4 Wag         6    4
## Datsun 710            4    1
## Hornet 4 Drive        6    1
## Hornet Sportabout     8    2
## Valiant               6    1
## Duster 360            8    4
## Merc 240D             4    2
## Merc 230              4    2
## Merc 280              6    4
## Merc 280C             6    4
## Merc 450SE            8    3
## Merc 450SL            8    3
## Merc 450SLC           8    3
## Cadillac Fleetwood    8    4
## Lincoln Continental   8    4
## Chrysler Imperial     8    4
## Fiat 128              4    1
## Honda Civic           4    2
## Toyota Corolla        4    1
## Toyota Corona         4    1
## Dodge Challenger      8    2
## AMC Javelin           8    2
## Camaro Z28            8    4
## Pontiac Firebird      8    2
## Fiat X1-9             4    1
## Porsche 914-2         4    2
## Lotus Europa          4    2
## Ford Pantera L        8    4
## Ferrari Dino          6    6
## Maserati Bora         8    8
## Volvo 142E            4    2

We can combine the select helpers with another operator, |, to select all columns that start with a “c” or end with a “m”.

mtcars %>% select(starts_with("c") | ends_with("m"))
##                     cyl carb am
## Mazda RX4             6    4  1
## Mazda RX4 Wag         6    4  1
## Datsun 710            4    1  1
## Hornet 4 Drive        6    1  0
## Hornet Sportabout     8    2  0
## Valiant               6    1  0
## Duster 360            8    4  0
## Merc 240D             4    2  0
## Merc 230              4    2  0
## Merc 280              6    4  0
## Merc 280C             6    4  0
## Merc 450SE            8    3  0
## Merc 450SL            8    3  0
## Merc 450SLC           8    3  0
## Cadillac Fleetwood    8    4  0
## Lincoln Continental   8    4  0
## Chrysler Imperial     8    4  0
## Fiat 128              4    1  1
## Honda Civic           4    2  1
## Toyota Corolla        4    1  1
## Toyota Corona         4    1  0
## Dodge Challenger      8    2  0
## AMC Javelin           8    2  0
## Camaro Z28            8    4  0
## Pontiac Firebird      8    2  0
## Fiat X1-9             4    1  1
## Porsche 914-2         4    2  1
## Lotus Europa          4    2  1
## Ford Pantera L        8    4  1
## Ferrari Dino          6    6  1
## Maserati Bora         8    8  1
## Volvo 142E            4    2  1