The transmute function in dplyr creates new variables, typically computed from existing ones. Unlike mutate, transmute drops all other columns by default and keeps only the ones you define. Creating new columns from computations on existing columns is a common data wrangling task. In this article, we will learn how to use dplyr's transmute function.
If you don’t have time to read, here is a quick code snippet for you.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.0 v dplyr 1.0.5
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
mtcars %>% transmute(mpg_avg = mean(mpg))
## mpg_avg
## Mazda RX4 20.09062
## Mazda RX4 Wag 20.09062
## Datsun 710 20.09062
## Hornet 4 Drive 20.09062
## Hornet Sportabout 20.09062
## Valiant 20.09062
## Duster 360 20.09062
## Merc 240D 20.09062
## Merc 230 20.09062
## Merc 280 20.09062
## Merc 280C 20.09062
## Merc 450SE 20.09062
## Merc 450SL 20.09062
## Merc 450SLC 20.09062
## Cadillac Fleetwood 20.09062
## Lincoln Continental 20.09062
## Chrysler Imperial 20.09062
## Fiat 128 20.09062
## Honda Civic 20.09062
## Toyota Corolla 20.09062
## Toyota Corona 20.09062
## Dodge Challenger 20.09062
## AMC Javelin 20.09062
## Camaro Z28 20.09062
## Pontiac Firebird 20.09062
## Fiat X1-9 20.09062
## Porsche 914-2 20.09062
## Lotus Europa 20.09062
## Ford Pantera L 20.09062
## Ferrari Dino 20.09062
## Maserati Bora 20.09062
## Volvo 142E 20.09062
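The snippet above repeats the same mean on every row because transmute returns one row per input row. As a small sketch of the contrast, the code below compares it with summarise, which collapses the data to a single row. The object names per_row and one_row are only illustrative.

```r
library(dplyr)

# transmute() keeps one row per input row, so the scalar mean is recycled
per_row <- mtcars %>% transmute(mpg_avg = mean(mpg))

# summarise() collapses ungrouped data to a single row
one_row <- mtcars %>% summarise(mpg_avg = mean(mpg))

nrow(per_row)  # 32
nrow(one_row)  # 1
```

If a single summary value is all you need, summarise is usually the better fit; transmute is more useful when the new column should line up with the original rows.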
We can load the dplyr package directly, but I recommend loading the tidyverse package, as we will use some of its other features.
library(tidyverse)
For this tutorial, we will use the mtcars data set that comes with base R (in the datasets package), so it is available without loading tidyverse. We take a look at this data set below.
data(mtcars)
glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,~
## $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,~
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16~
## $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180~
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,~
## $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.~
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18~
## $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,~
## $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,~
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,~
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,~
The basic use of transmute is to pass our data set and a named argument that defines the new column we would like. For example, let’s create a column holding the ratio of mpg to hp.
transmute(mtcars, mpg_hp = mpg / hp)
## mpg_hp
## Mazda RX4 0.19090909
## Mazda RX4 Wag 0.19090909
## Datsun 710 0.24516129
## Hornet 4 Drive 0.19454545
## Hornet Sportabout 0.10685714
## Valiant 0.17238095
## Duster 360 0.05836735
## Merc 240D 0.39354839
## Merc 230 0.24000000
## Merc 280 0.15609756
## Merc 280C 0.14471545
## Merc 450SE 0.09111111
## Merc 450SL 0.09611111
## Merc 450SLC 0.08444444
## Cadillac Fleetwood 0.05073171
## Lincoln Continental 0.04837209
## Chrysler Imperial 0.06391304
## Fiat 128 0.49090909
## Honda Civic 0.58461538
## Toyota Corolla 0.52153846
## Toyota Corona 0.22164948
## Dodge Challenger 0.10333333
## AMC Javelin 0.10133333
## Camaro Z28 0.05428571
## Pontiac Firebird 0.10971429
## Fiat X1-9 0.41363636
## Porsche 914-2 0.28571429
## Lotus Europa 0.26902655
## Ford Pantera L 0.05984848
## Ferrari Dino 0.11257143
## Maserati Bora 0.04477612
## Volvo 142E 0.19633028
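We can see that the result contains only the new column; the original columns have been dropped. The car names are still shown because they are row names, not a column.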
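To make the difference from mutate concrete, here is a minimal sketch that runs both verbs with the same expression and compares the shape of the results. The object names with_mutate and with_transmute are only illustrative.

```r
library(dplyr)

# mutate() appends the new column to the 11 existing ones
with_mutate <- mutate(mtcars, mpg_hp = mpg / hp)

# transmute() returns only the column defined in the call
with_transmute <- transmute(mtcars, mpg_hp = mpg / hp)

ncol(with_mutate)      # 12
names(with_transmute)  # "mpg_hp"
```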
When using tidyverse, we often use the pipe operator, %>%. With it, we can pass our data through the pipe instead of as the first argument. Let’s rewrite the example above.
mtcars %>% transmute(mpg_hp = mpg / hp)
## mpg_hp
## Mazda RX4 0.19090909
## Mazda RX4 Wag 0.19090909
## Datsun 710 0.24516129
## Hornet 4 Drive 0.19454545
## Hornet Sportabout 0.10685714
## Valiant 0.17238095
## Duster 360 0.05836735
## Merc 240D 0.39354839
## Merc 230 0.24000000
## Merc 280 0.15609756
## Merc 280C 0.14471545
## Merc 450SE 0.09111111
## Merc 450SL 0.09611111
## Merc 450SLC 0.08444444
## Cadillac Fleetwood 0.05073171
## Lincoln Continental 0.04837209
## Chrysler Imperial 0.06391304
## Fiat 128 0.49090909
## Honda Civic 0.58461538
## Toyota Corolla 0.52153846
## Toyota Corona 0.22164948
## Dodge Challenger 0.10333333
## AMC Javelin 0.10133333
## Camaro Z28 0.05428571
## Pontiac Firebird 0.10971429
## Fiat X1-9 0.41363636
## Porsche 914-2 0.28571429
## Lotus Europa 0.26902655
## Ford Pantera L 0.05984848
## Ferrari Dino 0.11257143
## Maserati Bora 0.04477612
## Volvo 142E 0.19633028
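If you are on R 4.1 or later, the native pipe |> should work the same way for this simple case. The sketch below assumes such an R version and checks that both pipes give the same result; the object names are only illustrative.

```r
library(dplyr)

# magrittr pipe, as used throughout this article
with_magrittr <- mtcars %>% transmute(mpg_hp = mpg / hp)

# native pipe, assuming R 4.1 or later
with_native <- mtcars |> transmute(mpg_hp = mpg / hp)

identical(with_magrittr, with_native)
```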
There are many functions we can use inside transmute; a list of them is provided in the documentation: https://dplyr.tidyverse.org/reference/mutate.html#useful-mutate-functions .
Let’s take a look at a few examples all in one section.
mtcars %>%
  transmute(
    mpg2 = mpg * 2,
    mpg2_squared = mpg * mpg,
    mpg_hp = mpg + hp,
    mpg_lead = lead(mpg),
    mpg_lag = lag(mpg),
    mpg_rank = min_rank(mpg)
  )
## mpg2 mpg2_squared mpg_hp mpg_lead mpg_lag mpg_rank
## Mazda RX4 42.0 441.00 131.0 21.0 NA 19
## Mazda RX4 Wag 42.0 441.00 131.0 22.8 21.0 19
## Datsun 710 45.6 519.84 115.8 21.4 21.0 24
## Hornet 4 Drive 42.8 457.96 131.4 18.7 22.8 21
## Hornet Sportabout 37.4 349.69 193.7 18.1 21.4 15
## Valiant 36.2 327.61 123.1 14.3 18.7 14
## Duster 360 28.6 204.49 259.3 24.4 18.1 4
## Merc 240D 48.8 595.36 86.4 22.8 14.3 26
## Merc 230 45.6 519.84 117.8 19.2 24.4 24
## Merc 280 38.4 368.64 142.2 17.8 22.8 16
## Merc 280C 35.6 316.84 140.8 16.4 19.2 13
## Merc 450SE 32.8 268.96 196.4 17.3 17.8 11
## Merc 450SL 34.6 299.29 197.3 15.2 16.4 12
## Merc 450SLC 30.4 231.04 195.2 10.4 17.3 7
## Cadillac Fleetwood 20.8 108.16 215.4 10.4 15.2 1
## Lincoln Continental 20.8 108.16 225.4 14.7 10.4 1
## Chrysler Imperial 29.4 216.09 244.7 32.4 10.4 5
## Fiat 128 64.8 1049.76 98.4 30.4 14.7 31
## Honda Civic 60.8 924.16 82.4 33.9 32.4 29
## Toyota Corolla 67.8 1149.21 98.9 21.5 30.4 32
## Toyota Corona 43.0 462.25 118.5 15.5 33.9 23
## Dodge Challenger 31.0 240.25 165.5 15.2 21.5 9
## AMC Javelin 30.4 231.04 165.2 13.3 15.5 7
## Camaro Z28 26.6 176.89 258.3 19.2 15.2 3
## Pontiac Firebird 38.4 368.64 194.2 27.3 13.3 16
## Fiat X1-9 54.6 745.29 93.3 26.0 19.2 28
## Porsche 914-2 52.0 676.00 117.0 30.4 27.3 27
## Lotus Europa 60.8 924.16 143.4 15.8 26.0 29
## Ford Pantera L 31.6 249.64 279.8 19.7 30.4 10
## Ferrari Dino 39.4 388.09 194.7 15.0 15.8 18
## Maserati Bora 30.0 225.00 350.0 21.4 19.7 6
## Volvo 142E 42.8 457.96 130.4 NA 15.0 21
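Many of these variables don’t tell us much, but they show the range of functions we can use with the transmute verb. Note that mpg2_squared is simply mpg * mpg, and that mpg_hp here is the sum mpg + hp rather than the ratio used earlier.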
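The documentation's list also covers cumulative and conditional helpers. As a rough sketch, the code below combines log, cumsum, case_when, and if_else in one call. The column names and the 18 and 25 mpg cut-offs are made up for illustration.

```r
library(dplyr)

extras <- mtcars %>%
  transmute(
    mpg_log    = log(mpg),       # element-wise transformation
    mpg_cumsum = cumsum(mpg),    # running total down the rows
    mpg_band   = case_when(      # first matching condition wins
      mpg >= 25 ~ "high",        # hypothetical threshold
      mpg >= 18 ~ "medium",      # hypothetical threshold
      TRUE      ~ "low"
    ),
    above_avg  = if_else(mpg > mean(mpg), "yes", "no")
  )

head(extras)
```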
We can recompute an existing variable by passing an argument with the same name. As before, only that column is kept in the result.
mtcars %>%
  transmute(
    hp = hp * 10
  )
## hp
## Mazda RX4 1100
## Mazda RX4 Wag 1100
## Datsun 710 930
## Hornet 4 Drive 1100
## Hornet Sportabout 1750
## Valiant 1050
## Duster 360 2450
## Merc 240D 620
## Merc 230 950
## Merc 280 1230
## Merc 280C 1230
## Merc 450SE 1800
## Merc 450SL 1800
## Merc 450SLC 1800
## Cadillac Fleetwood 2050
## Lincoln Continental 2150
## Chrysler Imperial 2300
## Fiat 128 660
## Honda Civic 520
## Toyota Corolla 650
## Toyota Corona 970
## Dodge Challenger 1500
## AMC Javelin 1500
## Camaro Z28 2450
## Pontiac Firebird 1750
## Fiat X1-9 660
## Porsche 914-2 910
## Lotus Europa 1130
## Ford Pantera L 2640
## Ferrari Dino 1750
## Maserati Bora 3350
## Volvo 142E 1090
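If you want to keep some of the original columns next to the recomputed one, you can name them in the call without an expression. The sketch below shows this, followed by what should be the equivalent spelling with mutate and its .keep argument, assuming dplyr 1.0.0 or later. In dplyr 1.1.0 and later the documentation marks transmute as superseded in favor of that form. The object names are only illustrative.

```r
library(dplyr)

# Name an existing column to keep it next to the recomputed one
kept <- mtcars %>% transmute(mpg, hp = hp * 10)
names(kept)  # "mpg" "hp"

# Assumption: dplyr 1.0.0 or later, where mutate() has a .keep argument
same_idea <- mtcars %>% mutate(hp = hp * 10, .keep = "none")
names(same_idea)  # "hp"
```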
We can use select helpers (https://dplyr.tidyverse.org/reference/group_cols.html?q=select%20helpers) together with across() to apply a function to several variables at once in transmute. In this example, we apply the as.character transformation to all columns that are not mpg. Because transmute keeps only the columns it creates, mpg is absent from the result.
mtcars %>%
  transmute(across(!mpg, as.character))
## cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 6 160 110 3.9 2.62 16.46 0 1 4 4
## Mazda RX4 Wag 6 160 110 3.9 2.875 17.02 0 1 4 4
## Datsun 710 4 108 93 3.85 2.32 18.61 1 1 4 1
## Hornet 4 Drive 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 8 360 175 3.15 3.44 17.02 0 0 3 2
## Valiant 6 225 105 2.76 3.46 20.22 1 0 3 1
## Duster 360 8 360 245 3.21 3.57 15.84 0 0 3 4
## Merc 240D 4 146.7 62 3.69 3.19 20 1 0 4 2
## Merc 230 4 140.8 95 3.92 3.15 22.9 1 0 4 2
## Merc 280 6 167.6 123 3.92 3.44 18.3 1 0 4 4
## Merc 280C 6 167.6 123 3.92 3.44 18.9 1 0 4 4
## Merc 450SE 8 275.8 180 3.07 4.07 17.4 0 0 3 3
## Merc 450SL 8 275.8 180 3.07 3.73 17.6 0 0 3 3
## Merc 450SLC 8 275.8 180 3.07 3.78 18 0 0 3 3
## Cadillac Fleetwood 8 472 205 2.93 5.25 17.98 0 0 3 4
## Lincoln Continental 8 460 215 3 5.424 17.82 0 0 3 4
## Chrysler Imperial 8 440 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 4 78.7 66 4.08 2.2 19.47 1 1 4 1
## Honda Civic 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 4 71.1 65 4.22 1.835 19.9 1 1 4 1
## Toyota Corona 4 120.1 97 3.7 2.465 20.01 1 0 3 1
## Dodge Challenger 8 318 150 2.76 3.52 16.87 0 0 3 2
## AMC Javelin 8 304 150 3.15 3.435 17.3 0 0 3 2
## Camaro Z28 8 350 245 3.73 3.84 15.41 0 0 3 4
## Pontiac Firebird 8 400 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 4 79 66 4.08 1.935 18.9 1 1 4 1
## Porsche 914-2 4 120.3 91 4.43 2.14 16.7 0 1 5 2
## Lotus Europa 4 95.1 113 3.77 1.513 16.9 1 1 5 2
## Ford Pantera L 8 351 264 4.22 3.17 14.5 0 1 5 4
## Ferrari Dino 6 145 175 3.62 2.77 15.5 0 1 5 6
## Maserati Bora 8 301 335 3.54 3.57 14.6 0 1 5 8
## Volvo 142E 4 121 109 4.11 2.78 18.6 1 1 4 2
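across() also accepts type-based selection and more than one function. As a minimal sketch, assuming dplyr 1.0.0 or later, the code below first rounds every numeric column, then computes two statistics for two columns. The object names rounded and stats are only illustrative.

```r
library(dplyr)

# where(is.numeric) selects columns by type; all mtcars columns are numeric
rounded <- mtcars %>% transmute(across(where(is.numeric), round))

# A named list of functions yields one output column per column/function pair
stats <- mtcars %>%
  transmute(across(c(mpg, hp), list(mean = mean, sd = sd)))

names(stats)  # "mpg_mean" "mpg_sd" "hp_mean" "hp_sd"
```

The output names follow the default pattern of column name, underscore, function name, taken from the names in the list.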
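When grouping data, we can make good use of the ranking functions (https://dplyr.tidyverse.org/reference/ranking.html). You can read more about them in the docs, but here is a quick example. Note that the grouping variable, cyl, is kept in the result even though we did not name it in transmute.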
mtcars %>%
  select(mpg, cyl) %>%
  group_by(cyl) %>%
  transmute(rank = min_rank(desc(mpg)))
## # A tibble: 32 x 2
## # Groups: cyl [3]
## cyl rank
## <dbl> <int>
## 1 6 2
## 2 6 2
## 3 4 8
## 4 6 1
## 5 8 2
## 6 6 6
## 7 8 11
## 8 4 7
## 9 4 8
## 10 6 5
## # ... with 22 more rows
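Grouping affects summary functions inside transmute as well as ranks. As a small sketch, the code below centers mpg on the mean of its cyl group and then calls ungroup so that later steps are not grouped by accident. The names centered and mpg_centered are only illustrative.

```r
library(dplyr)

centered <- mtcars %>%
  group_by(cyl) %>%
  transmute(
    mpg,                            # keep the raw value for reference
    mpg_centered = mpg - mean(mpg)  # mean() is computed within each cyl group
  ) %>%
  ungroup()                         # drop the grouping before further work

names(centered)  # "cyl" "mpg" "mpg_centered"
```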