In a previous post, we looked at how to create a rug plot to view the distribution of a continuous variable. We now move on to the histogram. A histogram is one of the most common ways to show the distribution of a continuous variable. The plot makes it simple to see the count of the values and to compare the distribution to a normal distribution. In this article, we will see how to create a histogram with the seaborn library.
To create a histogram in seaborn, we can pass a data set to the histplot function and specify the x-axis variable. The y-axis will show a frequency count of the observations that share the same value or narrow range of values. Let's see an example using the built-in penguins data set. We will view the distribution of flipper length.
import seaborn as sns

penguins = sns.load_dataset("penguins")
sns.histplot(data=penguins, x="flipper_length_mm")
The plot above shows us the count of penguins at each flipper length.
Another common task with histograms is to "bin" the counts. Often you will want to count observations over ranges of values rather than at individual values. For example, we could bin the flipper length into ranges such as 170-180 mm, 180-190 mm, 190-200 mm, etc.
A common binning example is age groups, where we bin people into ranges such as 0-5, 5-17, 17-34, etc. This is common in polls or studies of different groups of people.
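To make the idea of binning concrete, here is a minimal sketch in plain Python that counts values into the age ranges mentioned above. The ages are made up for illustration, the final edge of 120 is an assumption added to close the last range, and the helper name bin_label is hypothetical. Each range is treated as half-open, so an age of 17 lands in 17-34 rather than 5-17.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical ages, invented purely for illustration.
ages = [2, 4, 9, 16, 17, 21, 33, 34, 40]

# Bin edges from the text (0-5, 5-17, 17-34) plus an assumed upper edge of 120.
edges = [0, 5, 17, 34, 120]


def bin_label(value, edges):
    """Return the 'low-high' label of the half-open bin [low, high) holding value."""
    if not edges[0] <= value < edges[-1]:
        raise ValueError(f"{value} is outside the range covered by the bins")
    # bisect_right finds the first edge greater than value; the bin starts one before it.
    i = bisect_right(edges, value) - 1
    return f"{edges[i]}-{edges[i + 1]}"


# Count how many ages fall into each bin.
counts = Counter(bin_label(age, edges) for age in ages)

for label, n in counts.items():
    print(label, n)
```

A histogram is essentially a bar chart of these per-bin counts, with one bar per range.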
The main way to bin is to pass a number to the bins named parameter in seaborn. Seaborn will automatically select the ranges for us.
import seaborn as sns

penguins = sns.load_dataset("penguins")
# The parameter is named "bins" (plural); here we ask for 10 bins.
sns.histplot(data=penguins, x="flipper_length_mm", bins=10)
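When only a number of bins is given, the usual approach is to split the span from the smallest to the largest value into that many equal-width ranges. The sketch below illustrates that idea in plain Python. It is not seaborn's internal code, the flipper lengths are made up, and the function name equal_width_counts is hypothetical.

```python
def equal_width_counts(values, n_bins):
    """Split [min, max] into n_bins equal-width bins and count the values in each."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("all values are identical, so the bin width would be zero")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin instead of opening a new one.
        i = min(int((v - low) / width), n_bins - 1)
        counts[i] += 1
    # n_bins bins are delimited by n_bins + 1 edges.
    edges = [low + k * width for k in range(n_bins + 1)]
    return edges, counts


# Hypothetical flipper lengths in millimetres, invented for illustration.
lengths = [172, 181, 185, 190, 190, 195, 198, 210, 215, 222]

edges, counts = equal_width_counts(lengths, 5)
print(edges)
print(counts)
```

With five bins over the range 172 to 222, each bin is 10 mm wide. If you would rather fix the width of each bar than the number of bars, histplot also has a binwidth parameter.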