How to use dplyr transmute in R

06.09.2021

Intro

The transmute method in dplyr allows you to add new variables, especially computed ones. Unlike mutate, the transmute will remove other columns by default. A common data wrangling task is to create new columns using computations on existing columns. In this article, we will learn how to use the dplyr transmute method.

If you are in a hurry

If you don’t have time to read, here is a quick code snippet for you.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.0     v dplyr   1.0.5
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
mtcars %>% transmute(mpg_avg = mean(mpg))
##                      mpg_avg
## Mazda RX4           20.09062
## Mazda RX4 Wag       20.09062
## Datsun 710          20.09062
## Hornet 4 Drive      20.09062
## Hornet Sportabout   20.09062
## Valiant             20.09062
## Duster 360          20.09062
## Merc 240D           20.09062
## Merc 230            20.09062
## Merc 280            20.09062
## Merc 280C           20.09062
## Merc 450SE          20.09062
## Merc 450SL          20.09062
## Merc 450SLC         20.09062
## Cadillac Fleetwood  20.09062
## Lincoln Continental 20.09062
## Chrysler Imperial   20.09062
## Fiat 128            20.09062
## Honda Civic         20.09062
## Toyota Corolla      20.09062
## Toyota Corona       20.09062
## Dodge Challenger    20.09062
## AMC Javelin         20.09062
## Camaro Z28          20.09062
## Pontiac Firebird    20.09062
## Fiat X1-9           20.09062
## Porsche 914-2       20.09062
## Lotus Europa        20.09062
## Ford Pantera L      20.09062
## Ferrari Dino        20.09062
## Maserati Bora       20.09062
## Volvo 142E          20.09062

Loading the Library

We can load the dplyr package directly, but I recommend loading the tidyverse package as we will use some other features in side.

library(tidyverse)

Loading the Dataset

For this tutorial, we will use the mtcars data set the comes with tidyverse. We take a look at this data set below.

data(mtcars)

glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,~
## $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,~
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16~
## $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180~
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,~
## $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.~
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18~
## $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,~
## $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,~
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,~
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,~

Basic dplyr Transmute

The basic use of mutate is to pass our data set and a parameter with the new column we would like. For example, let’s create an avg mpg column.

transmute(mtcars, mpg_hp = mpg / hp)
##                         mpg_hp
## Mazda RX4           0.19090909
## Mazda RX4 Wag       0.19090909
## Datsun 710          0.24516129
## Hornet 4 Drive      0.19454545
## Hornet Sportabout   0.10685714
## Valiant             0.17238095
## Duster 360          0.05836735
## Merc 240D           0.39354839
## Merc 230            0.24000000
## Merc 280            0.15609756
## Merc 280C           0.14471545
## Merc 450SE          0.09111111
## Merc 450SL          0.09611111
## Merc 450SLC         0.08444444
## Cadillac Fleetwood  0.05073171
## Lincoln Continental 0.04837209
## Chrysler Imperial   0.06391304
## Fiat 128            0.49090909
## Honda Civic         0.58461538
## Toyota Corolla      0.52153846
## Toyota Corona       0.22164948
## Dodge Challenger    0.10333333
## AMC Javelin         0.10133333
## Camaro Z28          0.05428571
## Pontiac Firebird    0.10971429
## Fiat X1-9           0.41363636
## Porsche 914-2       0.28571429
## Lotus Europa        0.26902655
## Ford Pantera L      0.05984848
## Ferrari Dino        0.11257143
## Maserati Bora       0.04477612
## Volvo 142E          0.19633028

We can see at the end, we have a new column added to the end.

When using tidyverse, we often will use the pipe, %>%, operator. With this, we can pass our data using the pip instead. Let’s rewrite the example above.

mtcars %>% transmute(mpg_hp = mpg / hp)
##                         mpg_hp
## Mazda RX4           0.19090909
## Mazda RX4 Wag       0.19090909
## Datsun 710          0.24516129
## Hornet 4 Drive      0.19454545
## Hornet Sportabout   0.10685714
## Valiant             0.17238095
## Duster 360          0.05836735
## Merc 240D           0.39354839
## Merc 230            0.24000000
## Merc 280            0.15609756
## Merc 280C           0.14471545
## Merc 450SE          0.09111111
## Merc 450SL          0.09611111
## Merc 450SLC         0.08444444
## Cadillac Fleetwood  0.05073171
## Lincoln Continental 0.04837209
## Chrysler Imperial   0.06391304
## Fiat 128            0.49090909
## Honda Civic         0.58461538
## Toyota Corolla      0.52153846
## Toyota Corona       0.22164948
## Dodge Challenger    0.10333333
## AMC Javelin         0.10133333
## Camaro Z28          0.05428571
## Pontiac Firebird    0.10971429
## Fiat X1-9           0.41363636
## Porsche 914-2       0.28571429
## Lotus Europa        0.26902655
## Ford Pantera L      0.05984848
## Ferrari Dino        0.11257143
## Maserati Bora       0.04477612
## Volvo 142E          0.19633028

More Stat Example

There are many stats we can use when using the transmute function a list of them are provided in the documentation: https://dplyr.tidyverse.org/reference/mutate.html#useful-mutate-functions .

Let’s take a look at a few examples all in one section.

mtcars %>%
 transmute(
  mpg2 = mpg * 2,
  mpg2_squared = mpg * mpg,
  mpg_hp = mpg + hp,
  mpg_lead = lead(mpg),
  mpg_lag = lag(mpg),
  mpg_rank = min_rank(mpg)
)
##                     mpg2 mpg2_squared mpg_hp mpg_lead mpg_lag mpg_rank
## Mazda RX4           42.0       441.00  131.0     21.0      NA       19
## Mazda RX4 Wag       42.0       441.00  131.0     22.8    21.0       19
## Datsun 710          45.6       519.84  115.8     21.4    21.0       24
## Hornet 4 Drive      42.8       457.96  131.4     18.7    22.8       21
## Hornet Sportabout   37.4       349.69  193.7     18.1    21.4       15
## Valiant             36.2       327.61  123.1     14.3    18.7       14
## Duster 360          28.6       204.49  259.3     24.4    18.1        4
## Merc 240D           48.8       595.36   86.4     22.8    14.3       26
## Merc 230            45.6       519.84  117.8     19.2    24.4       24
## Merc 280            38.4       368.64  142.2     17.8    22.8       16
## Merc 280C           35.6       316.84  140.8     16.4    19.2       13
## Merc 450SE          32.8       268.96  196.4     17.3    17.8       11
## Merc 450SL          34.6       299.29  197.3     15.2    16.4       12
## Merc 450SLC         30.4       231.04  195.2     10.4    17.3        7
## Cadillac Fleetwood  20.8       108.16  215.4     10.4    15.2        1
## Lincoln Continental 20.8       108.16  225.4     14.7    10.4        1
## Chrysler Imperial   29.4       216.09  244.7     32.4    10.4        5
## Fiat 128            64.8      1049.76   98.4     30.4    14.7       31
## Honda Civic         60.8       924.16   82.4     33.9    32.4       29
## Toyota Corolla      67.8      1149.21   98.9     21.5    30.4       32
## Toyota Corona       43.0       462.25  118.5     15.5    33.9       23
## Dodge Challenger    31.0       240.25  165.5     15.2    21.5        9
## AMC Javelin         30.4       231.04  165.2     13.3    15.5        7
## Camaro Z28          26.6       176.89  258.3     19.2    15.2        3
## Pontiac Firebird    38.4       368.64  194.2     27.3    13.3       16
## Fiat X1-9           54.6       745.29   93.3     26.0    19.2       28
## Porsche 914-2       52.0       676.00  117.0     30.4    27.3       27
## Lotus Europa        60.8       924.16  143.4     15.8    26.0       29
## Ford Pantera L      31.6       249.64  279.8     19.7    30.4       10
## Ferrari Dino        39.4       388.09  194.7     15.0    15.8       18
## Maserati Bora       30.0       225.00  350.0     21.4    19.7        6
## Volvo 142E          42.8       457.96  130.4       NA    15.0       21

Many of these variables don’t tell us much, but we can see the many options we can use during the mutate verb.

Modifying Variables

We can overwrite a variable by passing a parameter with the same name.

mtcars %>%
 transmute(
  hp = hp * 10
)
##                       hp
## Mazda RX4           1100
## Mazda RX4 Wag       1100
## Datsun 710           930
## Hornet 4 Drive      1100
## Hornet Sportabout   1750
## Valiant             1050
## Duster 360          2450
## Merc 240D            620
## Merc 230             950
## Merc 280            1230
## Merc 280C           1230
## Merc 450SE          1800
## Merc 450SL          1800
## Merc 450SLC         1800
## Cadillac Fleetwood  2050
## Lincoln Continental 2150
## Chrysler Imperial   2300
## Fiat 128             660
## Honda Civic          520
## Toyota Corolla       650
## Toyota Corona        970
## Dodge Challenger    1500
## AMC Javelin         1500
## Camaro Z28          2450
## Pontiac Firebird    1750
## Fiat X1-9            660
## Porsche 914-2        910
## Lotus Europa        1130
## Ford Pantera L      2640
## Ferrari Dino        1750
## Maserati Bora       3350
## Volvo 142E          1090

Using Select Helpers

We can use select helpers, https://dplyr.tidyverse.org/reference/group_cols.html?q=select%20helpers, and apply functions to each variable with mutate. In this example, we will apply the as.character transformation to all columns that are not mpg.

mtcars %>%
 transmute(across(!mpg, as.character))
##                     cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4             6   160 110  3.9  2.62 16.46  0  1    4    4
## Mazda RX4 Wag         6   160 110  3.9 2.875 17.02  0  1    4    4
## Datsun 710            4   108  93 3.85  2.32 18.61  1  1    4    1
## Hornet 4 Drive        6   258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout     8   360 175 3.15  3.44 17.02  0  0    3    2
## Valiant               6   225 105 2.76  3.46 20.22  1  0    3    1
## Duster 360            8   360 245 3.21  3.57 15.84  0  0    3    4
## Merc 240D             4 146.7  62 3.69  3.19    20  1  0    4    2
## Merc 230              4 140.8  95 3.92  3.15  22.9  1  0    4    2
## Merc 280              6 167.6 123 3.92  3.44  18.3  1  0    4    4
## Merc 280C             6 167.6 123 3.92  3.44  18.9  1  0    4    4
## Merc 450SE            8 275.8 180 3.07  4.07  17.4  0  0    3    3
## Merc 450SL            8 275.8 180 3.07  3.73  17.6  0  0    3    3
## Merc 450SLC           8 275.8 180 3.07  3.78    18  0  0    3    3
## Cadillac Fleetwood    8   472 205 2.93  5.25 17.98  0  0    3    4
## Lincoln Continental   8   460 215    3 5.424 17.82  0  0    3    4
## Chrysler Imperial     8   440 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128              4  78.7  66 4.08   2.2 19.47  1  1    4    1
## Honda Civic           4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla        4  71.1  65 4.22 1.835  19.9  1  1    4    1
## Toyota Corona         4 120.1  97  3.7 2.465 20.01  1  0    3    1
## Dodge Challenger      8   318 150 2.76  3.52 16.87  0  0    3    2
## AMC Javelin           8   304 150 3.15 3.435  17.3  0  0    3    2
## Camaro Z28            8   350 245 3.73  3.84 15.41  0  0    3    4
## Pontiac Firebird      8   400 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9             4    79  66 4.08 1.935  18.9  1  1    4    1
## Porsche 914-2         4 120.3  91 4.43  2.14  16.7  0  1    5    2
## Lotus Europa          4  95.1 113 3.77 1.513  16.9  1  1    5    2
## Ford Pantera L        8   351 264 4.22  3.17  14.5  0  1    5    4
## Ferrari Dino          6   145 175 3.62  2.77  15.5  0  1    5    6
## Maserati Bora         8   301 335 3.54  3.57  14.6  0  1    5    8
## Volvo 142E            4   121 109 4.11  2.78  18.6  1  1    4    2

Window functions

When grouping data, we can make good use of https://dplyr.tidyverse.org/reference/ranking.html. You can read more about them in the docs, but here is a quick example.

mtcars %>%
 select(mpg, cyl) %>%
 group_by(cyl) %>%
 transmute(rank = min_rank(desc(mpg)))
## # A tibble: 32 x 2
## # Groups:   cyl [3]
##      cyl  rank
##    <dbl> <int>
##  1     6     2
##  2     6     2
##  3     4     8
##  4     6     1
##  5     8     2
##  6     6     6
##  7     8    11
##  8     4     7
##  9     4     8
## 10     6     5
## # ... with 22 more rows