When working with time series data, we often want to decompose a series into several components, usually the trend, seasonality, and noise. In this article, we will learn how to decompose a time series in Python.
Let's load a data set of monthly milk production from the URL below. Each row holds a month and the kilograms of milk produced per cow in that month.
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/ourcodingclub/CC-time-series/master/monthly_milk.csv')
df.month = pd.to_datetime(df.month)
df = df.set_index('month')
df.head()
month | milk_prod_per_cow_kg |
---|---|
1962-01-01 | 265.05 |
1962-02-01 | 252.45 |
1962-03-01 | 288.00 |
1962-04-01 | 295.20 |
1962-05-01 | 327.15 |
Let's first plot our time series to see the trend.
df.plot()
<AxesSubplot:xlabel='month'>
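If the seasonal swings make the trend hard to read in the raw plot, a rolling mean over one full year can smooth them out. The sketch below uses a synthetic monthly series as a stand-in (the series, its values, and the variable names are assumptions for illustration, not the milk data). With the real data, the same rolling call could be applied to df.milk_prod_per_cow_kg.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a monthly series (illustrative values only):
# a straight-line trend plus a 12-month seasonal wave.
index = pd.date_range("2000-01-01", periods=48, freq="MS")
months = np.arange(48)
series = pd.Series(
    100 + 2 * months + 10 * np.sin(2 * np.pi * months / 12),
    index=index,
)

# A 12-month centered rolling mean averages over one full seasonal cycle,
# so the seasonal wave cancels out and the underlying trend remains.
smoothed = series.rolling(window=12, center=True).mean()

# The first and last few months are NaN because the window is incomplete there.
print(smoothed.dropna().head())
```

This is only a quick visual aid. It is a simpler smoother than a full decomposition and does not separate out the seasonal pattern itself.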
To decompose a time series, we can use the seasonal_decompose function from the statsmodels package. We pass the variable we want to decompose and the type of model. There are two basic options, additive and multiplicative; here we use multiplicative.
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(df.milk_prod_per_cow_kg, model='multiplicative')
Now that we have the result, we can plot the individual components. For example, below we plot the seasonal and trend components.
result.seasonal.plot()
<AxesSubplot:xlabel='month'>
result.trend.plot()
<AxesSubplot:xlabel='month'>
We can also plot everything at once. Calling plot on the result draws the observed series together with the decomposed trend, seasonal, and residual components in a single figure.
result.plot()