How to Create a ggplot Density Plot in R

05.30.2021

Intro

A density plot lets us view the distribution of a continuous variable. It gives us an idea of whether the distribution matches one we recognize, or whether we may want to transform the variable so that it does. In this article, we will learn how to create a density plot in R with ggplot2.

For Those Who Are in a Hurry

If you don’t have time to read, here is a quick code snippet to use in your project. For others who want details, read on.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.0     v dplyr   1.0.5
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
data(starwars, package = 'dplyr')

ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density()

Loading the data

For this tutorial, we will use the starwars data set from the dplyr package.

library(tidyverse)
data(starwars, package = 'dplyr')

glimpse(starwars)
## Rows: 87
## Columns: 14
## $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Or~
## $ height     <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180, 2~
## $ mass       <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, 77.~
## $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown", N~
## $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light", "~
## $ eye_color  <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blue",~
## $ birth_year <dbl> 19.0, 112.0, 33.0, 41.9, 19.0, 52.0, 47.0, NA, 24.0, 57.0, ~
## $ sex        <chr> "male", "none", "none", "male", "female", "male", "female",~
## $ gender     <chr> "masculine", "masculine", "masculine", "masculine", "femini~
## $ homeworld  <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", "T~
## $ species    <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "Huma~
## $ films      <list> <"The Empire Strikes Back", "Revenge of the Sith", "Return~
## $ vehicles   <list> <"Snowspeeder", "Imperial Speeder Bike">, <>, <>, <>, "Imp~
## $ starships  <list> <"X-wing", "Imperial shuttle">, <>, <>, "TIE Advanced x1",~

The Basic ggplot Density Plot

To create a density plot in ggplot2, we can use the geom_density function after mapping a continuous variable to the x aesthetic in aes. In this example, we will use height from the starwars data set above.

library(ggplot2)


ggplot(starwars, aes(x = height)) +
  geom_density()
## Warning: Removed 6 rows containing non-finite values (stat_density).
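
The warning above appears because some characters have no recorded height. One way to make that explicit, sketched here on the assumption that dropping those rows is acceptable for your analysis, is to filter them out before plotting. The name heights_known is only an illustrative choice.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Keep only the rows where height is recorded
heights_known <- starwars %>%
  filter(!is.na(height))

# Same plot as before, now built from complete data only
p_known <- ggplot(heights_known, aes(x = height)) +
  geom_density()

p_known
```

If you would rather keep the data untouched, passing na.rm = TRUE to geom_density() removes the missing values silently instead.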

Customizing the ggplot Density Plot

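We can customize our density plots using parameters of the geom_density function. For example, we can change the line color with the color parameter, the line width with lwd, and the line type with linetype. Adding fill and alpha shades the area under the curve. Here are two examples.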

ggplot(starwars, aes(x = height)) +
  geom_density(color = 4,
               lwd = 1,
               linetype = 1)
## Warning: Removed 6 rows containing non-finite values (stat_density).

ggplot(starwars, aes(x = height)) +
  geom_density(color = 4,
               fill = 4,
               alpha = 0.25)
## Warning: Removed 6 rows containing non-finite values (stat_density).
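
Besides color and fill, the smoothness of the curve can also be tuned. Here is a minimal sketch using the adjust argument of geom_density, which multiplies the default bandwidth. The values 0.5 and 2 are arbitrary choices for illustration, and heights_known and p_adjust are hypothetical names.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")
heights_known <- filter(starwars, !is.na(height))

# One layer per bandwidth multiplier; mapping colour to a constant
# string gives each layer its own legend entry
p_adjust <- ggplot(heights_known, aes(x = height)) +
  geom_density(aes(colour = "adjust = 0.5"), adjust = 0.5) +
  geom_density(aes(colour = "adjust = 1 (default)")) +
  geom_density(aes(colour = "adjust = 2"), adjust = 2) +
  labs(colour = "Bandwidth")

p_adjust
```

Smaller values follow the data more closely and give a bumpier curve, while larger values give a smoother one.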

Changing the Kernel

Density plots allow you to customize the kernel used for the estimate. We can do this using the kernel parameter of the geom_density function.

ggplot(starwars, aes(x = height)) +
  geom_density(kernel = "rectangular")
## Warning: Removed 6 rows containing non-finite values (stat_density).
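
To see what the kernel actually changes, it can help to overlay a few of them on one plot. The sketch below assumes the kernel names accepted by stats::density(), which is where geom_density takes its list of kernels from. The three kernels picked here and the object names are only illustrative.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")
heights_known <- filter(starwars, !is.na(height))

# One layer per kernel, each labelled in the legend by its name
p_kernels <- ggplot(heights_known, aes(x = height)) +
  geom_density(aes(colour = "gaussian"), kernel = "gaussian") +
  geom_density(aes(colour = "rectangular"), kernel = "rectangular") +
  geom_density(aes(colour = "epanechnikov"), kernel = "epanechnikov") +
  labs(colour = "Kernel")

p_kernels
```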

Adjusting the ggplot Density Plot Labels

We can adjust the title, x-label, and y-label of our density plot using the labs function. We pass it the title, x, and y parameters.

ggplot(starwars, aes(x = height)) +
  geom_density() +
  labs(
    title = "Height of Star Wars Characters",
    x = "Height (cm)",
    y = "Density"
  )
## Warning: Removed 6 rows containing non-finite values (stat_density).

Group by Color

We can color the separate groups of our density plot by using the fill or colour aesthetics. Here is an example that maps both to sex, so each group gets its own outline and fill color.

library(ggplot2)


ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density()
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf
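
The "fewer than two data points" warning means at least one group is too small for a density estimate. Here is a sketch of one way to handle that, assuming it is acceptable to drop missing values and very small groups. The name heights_by_sex and the alpha value are illustrative choices.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Keep rows with a known height and sex, then drop groups that have
# fewer than two observations
heights_by_sex <- starwars %>%
  filter(!is.na(height), !is.na(sex)) %>%
  group_by(sex) %>%
  filter(n() >= 2) %>%
  ungroup()

# A semi-transparent fill keeps overlapping curves visible
p_groups <- ggplot(heights_by_sex, aes(x = height, colour = sex, fill = sex)) +
  geom_density(alpha = 0.3)

p_groups
```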

Facets Groups on a ggplot Density Plot

If we prefer to have separate plots, we can use the facet_ functions in ggplot. For example, here are plots separated by sex.

library(ggplot2)


ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  facet_grid(~sex)
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf
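
facet_grid is not the only option. As a rough sketch, facet_wrap can stack the panels in one column and let each panel use its own y scale, which can help when the groups differ a lot in how peaked they are. The filtering step and the object names are assumptions made for this example.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Drop missing values and groups too small for a density estimate
heights_by_sex <- starwars %>%
  filter(!is.na(height), !is.na(sex)) %>%
  group_by(sex) %>%
  filter(n() >= 2) %>%
  ungroup()

# One panel per sex, stacked vertically, each with a free y axis
p_facets <- ggplot(heights_by_sex, aes(x = height, fill = sex)) +
  geom_density(alpha = 0.5) +
  facet_wrap(~ sex, ncol = 1, scales = "free_y")

p_facets
```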

Limiting X and Y

If we would like to limit the x and y values of our plot, we can use the xlim and ylim functions.

ggplot(starwars, aes(x = height)) +
  geom_density() +
  xlim(125, 200) +
  ylim(0, .15)
## Warning: Removed 25 rows containing non-finite values (stat_density).
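
Notice that the warning now reports 25 removed rows instead of 6. That is because xlim() discards values outside the limits before the density is estimated, so the curve itself changes. If the goal is only to zoom in, coord_cartesian() is one alternative that keeps all the data. The sketch below reuses the same limits, and the object names are illustrative.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")
heights_known <- filter(starwars, !is.na(height))

# Zoom the view without dropping observations: the density is still
# estimated from every known height
p_zoom <- ggplot(heights_known, aes(x = height)) +
  geom_density() +
  coord_cartesian(xlim = c(125, 200), ylim = c(0, 0.15))

p_zoom
```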

Scaling X and Y

We can also scale the axes using the scale_ functions from ggplot. Here is an example of a log10 scale on the x axis and a sqrt scale on the y axis.

ggplot(starwars, aes(x = height)) +
  geom_density() +
  scale_x_log10() +
  scale_y_sqrt()
## Warning: Removed 6 rows containing non-finite values (stat_density).

Color and Fill Scales

There are many color options in ggplot. We can use scale_ functions like scale_fill_brewer() to have ggplot automatically assign colors from a palette to the groups in our data set.

library(ggplot2)


ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  scale_fill_brewer()
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf
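
Note that scale_fill_brewer() only changes the fill, so the outlines keep the default colors. Here is a sketch that applies the same ColorBrewer palette to both aesthetics. The "Set2" palette, the filtering step, and the object names are arbitrary choices for illustration.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Drop missing values and groups too small for a density estimate
heights_by_sex <- starwars %>%
  filter(!is.na(height), !is.na(sex)) %>%
  group_by(sex) %>%
  filter(n() >= 2) %>%
  ungroup()

# Use the same palette for the outline and the fill
p_brewer <- ggplot(heights_by_sex, aes(x = height, colour = sex, fill = sex)) +
  geom_density(alpha = 0.4) +
  scale_fill_brewer(palette = "Set2") +
  scale_colour_brewer(palette = "Set2")

p_brewer
```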

Customizing the Legend of a ggplot Density Plot

When we have groups, ggplot will add a legend to the plot. We can customize the position of this legend using the theme function and the legend.position parameter. Here are examples of moving the legend to the top, moving it to the bottom, and hiding it.

ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  theme(legend.position="top")
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf

ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  theme(legend.position="bottom")
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf

ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  theme(legend.position="none")
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf
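
The legend title can be changed as well. As a small sketch, labs() accepts a title for each aesthetic. Giving colour and fill the same title keeps them in a single merged legend. The title "Sex", the filtering step, and the object names are illustrative.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Drop missing values and groups too small for a density estimate
heights_by_sex <- starwars %>%
  filter(!is.na(height), !is.na(sex)) %>%
  group_by(sex) %>%
  filter(n() >= 2) %>%
  ungroup()

# Rename the legend for both aesthetics and place it at the bottom
p_legend <- ggplot(heights_by_sex, aes(x = height, colour = sex, fill = sex)) +
  geom_density(alpha = 0.4) +
  labs(colour = "Sex", fill = "Sex") +
  theme(legend.position = "bottom")

p_legend
```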

Using Themes with a ggplot Density Plot

If we want to use built-in styles for the full plot, ggplot provides themes to add to our plot. Here is an example of adding theme_classic to our plot.

ggplot(starwars, aes(x = height, colour = sex, fill = sex)) +
  geom_density() +
  theme_classic()
## Warning: Removed 6 rows containing non-finite values (stat_density).

## Warning: Groups with fewer than two data points have been dropped.

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf
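
A built-in theme can also be combined with theme() for finer adjustments. The sketch below uses theme_minimal with a larger base font size and then moves the legend. The theme() call comes after the built-in theme so that its settings are not overwritten. The specific theme, font size, and object names are illustrative choices.

```r
library(ggplot2)
library(dplyr)

data(starwars, package = "dplyr")

# Drop missing values and groups too small for a density estimate
heights_by_sex <- starwars %>%
  filter(!is.na(height), !is.na(sex)) %>%
  group_by(sex) %>%
  filter(n() >= 2) %>%
  ungroup()

# Complete theme first, individual tweaks second
p_theme <- ggplot(heights_by_sex, aes(x = height, colour = sex, fill = sex)) +
  geom_density(alpha = 0.4) +
  theme_minimal(base_size = 14) +
  theme(legend.position = "bottom")

p_theme
```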