 KoalaTea

# How to Check Stationarity of Time Series Data in Python

## Intro

Before modeling a time series, we often want to check whether the data is stationary. A stationary series has statistical properties, such as its mean and variance, that do not change over time. Many models assume a stationary series, and if this assumption is violated, our forecasts will not be reliable. In this article, we will learn how to check the stationarity of time series data in Python.

## Data

Let's load a data set of monthly milk production. We will load it from the URL below. The data has one row per month and records the kilograms of milk produced per cow.

import pandas as pd

df = pd.read_csv(url)  # url: location of the CSV file (placeholder, not given here)
df['month'] = pd.to_datetime(df['month'])
df = df.set_index('month')
df.head()
| month | milk_prod_per_cow_kg |
| --- | --- |
| 1962-01-01 | 265.05 |
| 1962-02-01 | 252.45 |
| 1962-03-01 | 288.00 |
| 1962-04-01 | 295.20 |
| 1962-05-01 | 327.15 |
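
If the file has the layout shown above, the date parsing and indexing can also be done while reading it. The sketch below is a minimal example under that assumption: it reads an inline copy of the five rows shown above rather than the real file, and it assumes the CSV is comma-separated with the columns `month` and `milk_prod_per_cow_kg`.

```python
import io

import pandas as pd

# Inline stand-in for the real CSV file (assumed layout: comma-separated,
# with the two column names shown in the output above).
csv_text = """month,milk_prod_per_cow_kg
1962-01-01,265.05
1962-02-01,252.45
1962-03-01,288.00
1962-04-01,295.20
1962-05-01,327.15
"""

# parse_dates converts the column to datetimes, index_col makes it the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["month"], index_col="month")

# Declare the monthly (month-start) frequency explicitly on the index.
df = df.asfreq("MS")

print(df.index.dtype)
print(df.shape)
```

With the real data, `io.StringIO(csv_text)` would be replaced by the file's path or URL.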

## Visually Checking

One way to check whether the data is stationary is to plot it. A plot should always be used in combination with other methods, but some data sets show trend and seasonality clearly. For example, in the plot of our data we can see an upward trend and a definite seasonal pattern, both of which indicate that the series is not stationary.

df.plot()
<AxesSubplot:xlabel='month'>
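
A common complement to looking at the raw plot is to look at rolling statistics, since a stationary series should keep a roughly constant mean and spread over time. The following is a minimal sketch, not the milk data: it builds a synthetic monthly series with made-up numbers (a linear trend plus a 12-month cycle), and the window of 12 is an assumption chosen to match one seasonal cycle.

```python
import math

import pandas as pd

# Synthetic monthly series (made-up numbers, not the milk data):
# a linear upward trend plus a 12-month seasonal cycle.
index = pd.date_range("1962-01-01", periods=96, freq="MS")
values = [265 + 2 * t + 30 * math.sin(2 * math.pi * t / 12) for t in range(96)]
series = pd.Series(values, index=index, name="synthetic_kg")

# A 12-month window averages over one full seasonal cycle,
# so what remains in the rolling mean is mostly the trend.
rolling_mean = series.rolling(window=12).mean()
rolling_std = series.rolling(window=12).std()

# Compare the first and last complete windows.
print(rolling_mean.dropna().iloc[[0, -1]])
print(rolling_std.dropna().iloc[[0, -1]])

# To inspect everything in one chart (requires matplotlib):
# pd.concat({"data": series, "rolling mean": rolling_mean,
#            "rolling std": rolling_std}, axis=1).plot()
```

Here the rolling mean climbs steadily, which is the signature of a trend. For a stationary series, the rolling mean and rolling standard deviation would both stay roughly flat.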

## Using the ADF Test

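Another way to check whether the data is stationary is to use the Augmented Dickey-Fuller (ADF) test, which tests for a unit root. If there is a unit root, the data is not stationary. The ADF test is a hypothesis test: the null hypothesis is that there is a unit root (non-stationary), and the alternative is that there is no unit root (stationary). We can run it with the adfuller function from the statsmodels library. The result is a tuple containing the test statistic, the p-value, the number of lags used, the number of observations used, the critical values, and the best information criterion value.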

from statsmodels.tsa.stattools import adfuller

adfuller(df)
(-1.3038115874221432,
0.6274267086030254,
13,
154,
{'1%': -3.473542528196209,
'5%': -2.880497674144038,
'10%': -2.576878053634677},
870.8296896968735)
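# Get the p-value
res = adfuller(df)
res[1]
0.6274267086030254

The p-value of about 0.63 is well above 0.05, so we fail to reject the null hypothesis of a unit root and treat the series as non-stationary. The test statistic (about -1.30) is also less negative than every critical value, which leads to the same conclusion.
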
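
The tuple returned by adfuller is easier to read once it is unpacked into named values. The helper below is a small sketch for that purpose; `summarize_adf` is a hypothetical name, not part of statsmodels, the 0.05 significance level is an assumption, and the input is simply the result printed above for the milk data.

```python
def summarize_adf(result, alpha=0.05):
    """Unpack the 6-value tuple that adfuller returns by default."""
    stat, p_value, used_lags, n_obs, critical_values, _icbest = result
    return {
        "test_statistic": stat,
        "p_value": p_value,
        "used_lags": used_lags,
        "n_obs": n_obs,
        # Rejecting the null (unit root) is evidence of stationarity.
        "reject_unit_root": p_value < alpha,
        # The statistic must be MORE negative than a critical value to reject.
        "below_critical": {
            level: stat < value for level, value in critical_values.items()
        },
    }


# The result printed above for the milk production data.
milk_result = (
    -1.3038115874221432,
    0.6274267086030254,
    13,
    154,
    {"1%": -3.473542528196209, "5%": -2.880497674144038, "10%": -2.576878053634677},
    870.8296896968735,
)

summary = summarize_adf(milk_result)
print(summary["reject_unit_root"])  # False
print(summary["below_critical"])    # False at every level
```

Both checks agree: the unit root cannot be rejected at any of the listed levels, so the series is treated as non-stationary.
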
from statsmodels.stats.diagnostic import acorr_ljungbox
acorr_ljungbox(df, lags=, return_df=True)