How to Create a Histogram in Seaborn

2021-01-20

Intro

In a previous post, we looked at how to create a rug plot to view the distribution of a continuous variable. We now move on to the histogram. A histogram is one of the most common ways to show the distribution of a continuous variable. The plot makes it simple to see the count of values in each range and to compare the shape of the distribution to a normal distribution. In this article, we will see how to create a histogram with the seaborn library.

Creating a Histogram

To create a histogram in seaborn, we can pass a data set to the histplot function and specify the column to plot on the x-axis. The y-axis will show a frequency count of the observations that fall in the same range of values. Let's see an example using the built-in penguins data set. We will view the distribution of flipper length.

import seaborn as sns

penguins = sns.load_dataset("penguins")

sns.histplot(data = penguins,
			 x = "flipper_length_mm")

The plot above shows us the count of penguins for each range of flipper lengths.

Using Histogram Bins

Another common task with histograms is to "bin" the counts. Often you will want to count values within ranges rather than at each individual value. For example, we could bin the flipper length from 0-10, 10-50, 50-80, etc.

A common bin example is age groups, where we bin people from 0-5, 5-17, 17-34, etc. This is common in polls or studies of different demographic groups.
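
To make the age group idea concrete, here is a minimal sketch that counts values per bin with NumPy rather than seaborn. The ages and the upper edge of 100 are made up for illustration. Note that NumPy treats each bin as including its left edge and excluding its right edge, except for the last bin, which includes both.

```python
import numpy as np

# Hypothetical ages, made up for illustration only
ages = np.array([2, 4, 5, 10, 16, 17, 20, 33, 34, 50])

# Bin edges for the groups 0-5, 5-17, 17-34 and 34-100 (the 100 is an assumed upper limit)
edges = [0, 5, 17, 34, 100]

# counts[i] is the number of ages that fall between edges[i] and edges[i + 1]
counts, _ = np.histogram(ages, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {count}")
```

An age of exactly 5 lands in the 5-17 group, not the 0-5 group, which is worth keeping in mind when the ranges share an edge.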

The main way to bin is to pass a number to the bins named parameter in seaborn. Seaborn will automatically select the ranges for us.

import seaborn as sns

penguins = sns.load_dataset("penguins")

sns.histplot(data = penguins,
			 x = "flipper_length_mm",
			 bins = 10)
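
To get a feel for what "automatically select the ranges" means, the sketch below uses NumPy to split a range into ten equal-width bins. The flipper lengths are made-up values, not the penguins data set, and the assumption here is that seaborn follows the same equal-width scheme when it is given a number of bins.

```python
import numpy as np

# Made-up flipper lengths in mm (illustrative values, not the penguins data set)
flipper_lengths = np.array([172, 181, 186, 190, 195, 199, 203, 210, 215, 222, 231])

# Ask for 10 bins: NumPy splits the span from the minimum to the maximum
# into 10 intervals of equal width, which gives 11 edges
edges = np.histogram_bin_edges(flipper_lengths, bins=10)

# Count how many values fall into each interval
counts, _ = np.histogram(flipper_lengths, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f} mm: {count}")
```

With a minimum of 172 and a maximum of 231, each of the ten bins is 5.9 mm wide. Fewer bins give a smoother, coarser picture, while more bins show more detail but also more noise.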