When analyzing a data set, you often want to compare the categories of a categorical variable. For example, you may have a list of sales and want to display a count for each product type: books, shoes, etc. A bar chart is a good way to display and compare these counts. In this article, we will learn how to create a bar chart with ggplot2 in R.
For those with little time, here is a quick snippet of BarPlots. Read on for more details.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.0 v dplyr 1.0.5
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
data(diamonds)
ggplot(diamonds, aes(x = cut)) +
geom_bar()
For our tutorial, we will use the diamonds data set that comes with the ggplot2 package.
library(tidyverse)
data(diamonds)
glimpse(diamonds)
## Rows: 53,940
## Columns: 10
## $ carat <dbl> 0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23, 0.~
## $ cut <ord> Ideal, Premium, Good, Premium, Good, Very Good, Very Good, Ver~
## $ color <ord> E, E, E, I, J, J, I, H, E, H, J, J, F, J, E, E, I, J, J, J, I,~
## $ clarity <ord> SI2, SI1, VS1, VS2, SI2, VVS2, VVS1, SI1, VS2, VS1, SI1, VS1, ~
## $ depth <dbl> 61.5, 59.8, 56.9, 62.4, 63.3, 62.8, 62.3, 61.9, 65.1, 59.4, 64~
## $ table <dbl> 55, 61, 65, 58, 58, 57, 57, 55, 61, 61, 55, 56, 61, 54, 62, 58~
## $ price <int> 326, 326, 327, 334, 335, 336, 336, 337, 337, 338, 339, 340, 34~
## $ x <dbl> 3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95, 4.07, 3.87, 4.00, 4.~
## $ y <dbl> 3.98, 3.84, 4.07, 4.23, 4.35, 3.96, 3.98, 4.11, 3.78, 4.05, 4.~
## $ z <dbl> 2.43, 2.31, 2.31, 2.63, 2.75, 2.48, 2.47, 2.53, 2.49, 2.39, 2.~
To create a BarPlot in ggplot2, we can use the geom_bar method after supplying a categorical variable to the x of our aes aesthetic. By default, geom_bar counts the rows in each category and draws one bar per category. In this example, we will use cut from the diamonds data set above.
ggplot(diamonds, aes(x = cut)) +
geom_bar()
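If the counts have already been computed, geom_col() can draw them directly. The following is a minimal sketch, assuming dplyr is installed alongside ggplot2; the names cut_counts and p_counts are our own and chosen only for illustration.

```r
library(ggplot2)
library(dplyr)

# Count the rows for each cut explicitly (one row per category, counts in column n)
cut_counts <- diamonds %>%
  count(cut)

# geom_col() takes bar heights straight from a column instead of counting rows
p_counts <- ggplot(cut_counts, aes(x = cut, y = n)) +
  geom_col()

p_counts
```

This should produce the same chart as geom_bar() on the raw data, and it is handy when the data arrives pre-aggregated.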
We can also flip the plot to orient horizontally by using the coord_flip method.
ggplot(diamonds, aes(x = cut)) +
geom_bar() +
coord_flip()
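As an alternative sketch, assuming ggplot2 3.3.0 or later, the categorical variable can be mapped to y instead of x, and geom_bar() then draws horizontal bars without coord_flip(). The name p_horizontal is our own.

```r
library(ggplot2)

# Mapping cut to y makes geom_bar() draw horizontal bars
p_horizontal <- ggplot(diamonds, aes(y = cut)) +
  geom_bar()

p_horizontal
```

Either approach works; mapping to y keeps the axis names matching what appears on screen, which can make later calls such as labs() easier to reason about.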
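We can customize our BarPlots using some parameters on the geom_bar method. For example, the color parameter sets the outline of the bars, fill sets their interior, and alpha sets their transparency. A number such as 4 selects a color by its position in R's current palette. Here is an example.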
ggplot(diamonds, aes(x = cut)) +
geom_bar(color = 4,
fill = 4,
alpha = 0.25)
We can adjust the title, x-label, and y-label of our BarPlot using the labs method. We then pass the title, x, and y parameters.
ggplot(diamonds, aes(x = cut)) +
geom_bar() +
labs(
title = "Cut Distribution",
x = "Cut",
y = "Count"
)
We can color the separate groups of our bar charts by using the fill or colour aesthetic properties. Here is an example of using the fill aesthetic to assign a color to each level of a factor.
library(ggplot2)
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar()
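By default the groups are stacked within each bar. The position argument of geom_bar() changes this arrangement. The sketch below shows two common options; the names p_dodge and p_fill are our own.

```r
library(ggplot2)

# position = "dodge" places the bars of each group side by side
p_dodge <- ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar(position = "dodge")

# position = "fill" stretches every bar to the same height,
# so the segments show proportions rather than counts
p_fill <- ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

p_dodge
p_fill
```

Dodged bars make it easier to compare counts between groups, while filled bars make it easier to compare the share of each group across cuts.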
If we prefer to have separate plots, we can use the facet_ methods in ggplot. For example, here are plots separated by each cut.
library(ggplot2)
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
facet_grid(~cut)
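Faceting is often easier to read when the facet variable differs from the variable on the x axis. As a sketch, the following uses facet_wrap() to draw one panel per color while keeping cut on the x axis. Choosing color as the facet variable is our own assumption for illustration, as is the name p_facets.

```r
library(ggplot2)

# One panel per diamond color, each showing the count of every cut
p_facets <- ggplot(diamonds, aes(x = cut)) +
  geom_bar() +
  facet_wrap(~color)

p_facets
```

facet_wrap() lays the panels out in a grid that wraps onto new rows, which tends to suit a single facet variable with several levels.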
If we would like to limit the y values of our plots, we can use the ylim function.
ggplot(diamonds, aes(x = cut)) +
geom_bar() +
ylim(0, 15000)
## Warning: Removed 1 rows containing missing values (geom_bar).
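The warning appears because ylim() discards data that falls outside the limits, so the bar for the Ideal cut, whose count exceeds 15000, is removed entirely. If the goal is only to zoom in on part of the axis, one option is coord_cartesian(), which changes the visible range without dropping data. A minimal sketch, with p_zoom as our own name:

```r
library(ggplot2)

# coord_cartesian() zooms the view; bars taller than the limit are cut off
# at the top of the panel instead of being removed
p_zoom <- ggplot(diamonds, aes(x = cut)) +
  geom_bar() +
  coord_cartesian(ylim = c(0, 15000))

p_zoom
```

With this version all five bars remain in the plot and no warning should be printed.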
We can also scale the y axis using the scale_y_ functions from ggplot. Here are some examples of a log10 and a sqrt scale of the y axis.
ggplot(diamonds, aes(x = cut)) +
geom_bar() +
scale_y_log10()
ggplot(diamonds, aes(x = cut)) +
geom_bar() +
scale_y_sqrt()
There are many color options in ggplot. We can use scale_ methods like scale_fill_brewer() to have ggplot automatically assign colors from a ColorBrewer palette to the groups in our data set.
library(ggplot2)
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
scale_fill_brewer()
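scale_fill_brewer() also accepts a palette argument that selects a specific ColorBrewer palette by name. The sketch below uses "Set2", a qualitative palette; the palette choice and the name p_palette are our own assumptions for illustration.

```r
library(ggplot2)

# Pick a named ColorBrewer palette instead of the default one
p_palette <- ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar() +
  scale_fill_brewer(palette = "Set2")

p_palette
```

Running RColorBrewer::display.brewer.all() lists the available palette names, assuming the RColorBrewer package is installed.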
When we have groups, ggplot will add a legend to the plot. We can customize the position of this legend using the theme method and the legend.position parameter. Here are examples of moving the legend to the top, moving it to the bottom, and hiding it.
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
theme(legend.position="top")
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
theme(legend.position="bottom")
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
theme(legend.position="none")
If we want to use built-in styles for the full plot, ggplot provides themes to add to our plot. Here is an example of adding theme_classic to our plot.
ggplot(diamonds, aes(x = cut, fill = color)) +
geom_bar() +
theme_classic()