A qqplot, or quantile-quantile plot, helps you check whether your data are consistent with the normality assumption. In this article, we will learn how to draw a qqplot with ggplot2.
If you are short on time, here is the code. Read on for more details.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.0 v dplyr 1.0.5
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
data(starwars, package = 'dplyr')
ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line()

If you are familiar with my tutorials, you know I like to use the starwars data set, so we will load that.
library(tidyverse)
data(starwars, package = 'dplyr')

To create the basic plot, we can use the stat_qq function to calculate
the quantiles and the stat_qq_line function to calculate the reference line. We add
both of these layers to our usual ggplot function call. Instead of x and y, we
pass the variable we want to check to the sample parameter in our aesthetic. Here, we
are plotting height.
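
To make it clearer what these two functions calculate, we can rebuild the points and the line by hand. This is only a sketch, and it assumes the ggplot2 defaults as I understand them: stat_qq pairs the sorted sample with standard normal quantiles taken at ppoints(n), and stat_qq_line draws a line through the first and third quartiles. The names heights, qq_manual, slope and intercept are made up for this example.

```r
library(ggplot2)
data(starwars, package = "dplyr")

# Drop missing heights and sort: the sorted values are the sample quantiles.
heights <- sort(starwars$height[is.finite(starwars$height)])
n <- length(heights)

# Theoretical standard normal quantiles at n evenly spaced probabilities.
theoretical <- qnorm(ppoints(n))
qq_manual <- data.frame(theoretical = theoretical, sample = heights)

# Reference line through the first and third quartiles
# (assumed to match the stat_qq_line default).
probs <- c(0.25, 0.75)
slope <- unname(diff(quantile(heights, probs)) / diff(qnorm(probs)))
intercept <- unname(quantile(heights, probs[1])) - slope * qnorm(probs[1])

ggplot(qq_manual, aes(x = theoretical, y = sample)) +
  geom_point() +
  geom_abline(slope = slope, intercept = intercept)
```

If the heights were normally distributed, the points would sit close to the line. The ggplot2 call that does all of this for us follows.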
ggplot(starwars, aes(sample = height)) +
stat_qq() +
stat_qq_line()
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).

We can also split our qqplots across a factor. If we map a factor of eye color to the colour aesthetic, we get one set of points and one line per eye color.
ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line()
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).

We can also separate the groups by shape instead of color.
ggplot(starwars, aes(sample = height, shape = factor(eye_color))) +
stat_qq() +
stat_qq_line()
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).
## Warning: The shape palette can deal with a maximum of 6 discrete values because
## more than 6 becomes difficult to discriminate; you have 14. Consider
## specifying shapes manually if you must have them.
## Warning: Removed 31 rows containing missing values (geom_point).

We can use the theme function with the legend.position parameter to
adjust the legend. Here are a few examples of changing the legend position
and removing it.
ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line() +
theme(legend.position="top")
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).

ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line() +
theme(legend.position="bottom")
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).

ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line() +
theme(legend.position="none")
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).

To customize the titles on the plot, we can use the labs function.
This allows us to edit the title, x label and y label.
ggplot(starwars, aes(sample = height, colour = factor(eye_color))) +
stat_qq() +
stat_qq_line() +
labs(
title = "QQ Plot",
x = "Theoretical Quantiles",
y = "Sample Quantiles"
)
## Warning: Removed 6 rows containing non-finite values (stat_qq).
## Warning: Removed 6 rows containing non-finite values (stat_qq_line).
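
The warnings printed with each plot come from two things: six characters have no recorded height, and eye_color has more levels than the six shapes in the default shape palette. If we want clean output, one possible approach is sketched here. It filters out the missing heights before plotting and supplies the shapes manually, as the shape warning suggests. The names starwars_height and n_eye are made up for this example, and using shape codes 1, 2, 3 and so on is just one arbitrary choice.

```r
library(ggplot2)
library(dplyr)
data(starwars, package = "dplyr")

# Keep only characters with a recorded height, so stat_qq drops nothing.
starwars_height <- starwars %>%
  filter(!is.na(height))

# One shape code per remaining eye color (valid point shapes run from 0 to 25).
n_eye <- n_distinct(starwars_height$eye_color)

ggplot(starwars_height, aes(sample = height, shape = factor(eye_color))) +
  stat_qq() +
  stat_qq_line() +
  scale_shape_manual(values = seq_len(n_eye))
```

With this many groups the shapes are still hard to tell apart, so colour is probably the more readable choice for eye color.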