How to use dplyr count in R

06.06.2021

Intro

The count will display the count of unique values for a column in your data set. This helps you quickly view the count of variables in a tabular form. In this article, we will learn how to use the dplyr count function in R.

If you are in a hurry

If you don’t have time to read, here is a quick code snippet for you.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.0     v dplyr   1.0.5
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
mtcars %>% count(mpg)
##     mpg n
## 1  10.4 2
## 2  13.3 1
## 3  14.3 1
## 4  14.7 1
## 5  15.0 1
## 6  15.2 2
## 7  15.5 1
## 8  15.8 1
## 9  16.4 1
## 10 17.3 1
## 11 17.8 1
## 12 18.1 1
## 13 18.7 1
## 14 19.2 2
## 15 19.7 1
## 16 21.0 2
## 17 21.4 2
## 18 21.5 1
## 19 22.8 2
## 20 24.4 1
## 21 26.0 1
## 22 27.3 1
## 23 30.4 2
## 24 32.4 1
## 25 33.9 1

Loading the Library

We can load the dplyr package directly, but I recommend loading the tidyverse package as we will use some other features in side.

library(tidyverse)

Loading the Dataset

For this tutorial, we will use the mtcars data set the comes with tidyverse. We take a look at this data set below.

data(mtcars)

glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,~
## $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,~
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16~
## $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180~
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,~
## $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.~
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18~
## $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,~
## $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,~
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,~
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,~

Basic dplyr Count

To use the basic count function, we can call count and pass the data set and the column name we want to count.

count(mtcars, cyl)
##   cyl  n
## 1   4 11
## 2   6  7
## 3   8 14

We see that we get a data frame with the column and a count of unique variables per value. For example, there are 11 rows with cyl == 4.

Normally, we use the pipe, %>%, operator when working with tidyverse and dplyr. We can send your data set to the count function using this operator.

mtcars %>% count(cyl)
##   cyl  n
## 1   4 11
## 2   6  7
## 3   8 14

Sorting by Count

We can sort by count by using the sort parameter and setting that to TRUE.

mtcars %>% count(cyl, sort = TRUE)
##   cyl  n
## 1   8 14
## 2   4 11
## 3   6  7

Count Multiple Columns

We can sort by multiple columns by passing multiple column names. In this example below, we count the combination of cyl and hp.

mtcars %>% count(cyl, hp, sort = TRUE)
##    cyl  hp n
## 1    6 110 3
## 2    8 180 3
## 3    4  66 2
## 4    6 123 2
## 5    8 150 2
## 6    8 175 2
## 7    8 245 2
## 8    4  52 1
## 9    4  62 1
## 10   4  65 1
## 11   4  91 1
## 12   4  93 1
## 13   4  95 1
## 14   4  97 1
## 15   4 109 1
## 16   4 113 1
## 17   6 105 1
## 18   6 175 1
## 19   8 205 1
## 20   8 215 1
## 21   8 230 1
## 22   8 264 1
## 23   8 335 1

Counting Computed Problems

We can count by a computed columns as well. We could use the mutate function, however, count allows us to add a new column as well. Here we calculate the ration of mpg to hp and sort by that ratio.

mtcars %>% count(mph_by_hp = mpg / hp)
##     mph_by_hp n
## 1  0.04477612 1
## 2  0.04837209 1
## 3  0.05073171 1
## 4  0.05428571 1
## 5  0.05836735 1
## 6  0.05984848 1
## 7  0.06391304 1
## 8  0.08444444 1
## 9  0.09111111 1
## 10 0.09611111 1
## 11 0.10133333 1
## 12 0.10333333 1
## 13 0.10685714 1
## 14 0.10971429 1
## 15 0.11257143 1
## 16 0.14471545 1
## 17 0.15609756 1
## 18 0.17238095 1
## 19 0.19090909 2
## 20 0.19454545 1
## 21 0.19633028 1
## 22 0.22164948 1
## 23 0.24000000 1
## 24 0.24516129 1
## 25 0.26902655 1
## 26 0.28571429 1
## 27 0.39354839 1
## 28 0.41363636 1
## 29 0.49090909 1
## 30 0.52153846 1
## 31 0.58461538 1

Count by Weight

We can performa a weighted count by using the wt parameters. Here we supply hp as a weight for our count.

mtcars %>% count(mpg, wt = hp)
##     mpg   n
## 1  10.4 420
## 2  13.3 245
## 3  14.3 245
## 4  14.7 230
## 5  15.0 335
## 6  15.2 330
## 7  15.5 150
## 8  15.8 264
## 9  16.4 180
## 10 17.3 180
## 11 17.8 123
## 12 18.1 105
## 13 18.7 175
## 14 19.2 298
## 15 19.7 175
## 16 21.0 220
## 17 21.4 219
## 18 21.5  97
## 19 22.8 188
## 20 24.4  62
## 21 26.0  91
## 22 27.3  66
## 23 30.4 165
## 24 32.4  66
## 25 33.9  65

Adding a Count Column

Normally, the count function returns a data frame with the columns to count and the count n column. If we would like to keep all of our columns, we can append the count n column by using the add_count method.

mtcars %>% add_count(mpg)
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb n
## 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 2
## 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 2
## 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 2
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 2
## 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 1
## 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 1
## 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 1
## 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 1
## 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 2
## 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 2
## 11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4 1
## 12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3 1
## 13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3 1
## 14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3 2
## 15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4 2
## 16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4 2
## 17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4 1
## 18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1 1
## 19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2 2
## 20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1 1
## 21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1 1
## 22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2 1
## 23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2 2
## 24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4 1
## 25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2 2
## 26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1 1
## 27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2 1
## 28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2 2
## 29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4 1
## 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6 1
## 31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8 1
## 32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2 2