 KoalaTea

# How to Perform a Ljung-Box Test in Python

## Intro

When working with time series, we often have to deal with autocorrelation. The Ljung-Box test is a statistical test that checks whether a time series is autocorrelated. In this article, we will learn how to perform a Ljung-Box test in Python.

The Ljung-Box test is a hypothesis test that checks whether a time series is autocorrelated up to a chosen lag. The null hypothesis H0 is that the observations (often the residuals of a fitted model) are independently distributed. The alternative hypothesis is that they are not independently distributed and exhibit serial correlation.
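For reference, the test statistic is usually written as below. The symbols are notation introduced here, not taken from the article: n is the number of observations, h is the number of lags tested, and ρ̂_k is the sample autocorrelation at lag k.

$$
Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{2}}{n-k}
$$

Under the null hypothesis, Q approximately follows a chi-squared distribution with h degrees of freedom, so a large Q (and therefore a small p-value) is evidence against independence.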

## Data

Let's load a data set of monthly milk production. We will load it from a URL. Each row holds a month and the kilograms of milk produced per cow in that month.

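```python
import pandas as pd

# The line that reads the data set (and its URL) is missing from the source.
# `df` is assumed to be already loaded with a `month` column.
df['month'] = pd.to_datetime(df['month'])  # parse the month strings as dates
df = df.set_index('month')                 # use the dates as the index
df.head()
```

```
            milk_prod_per_cow_kg
month
1962-01-01                265.05
1962-02-01                252.45
1962-03-01                288.00
1962-04-01                295.20
1962-05-01                327.15
```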

## Conducting the Ljung-Box Test

To conduct a Ljung-Box test, we can use the acorr_ljungbox function from the statsmodels package. We pass our time series and the lag to test.

We start with a lag of 1, which checks whether each observation is correlated with the one immediately before it.

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# lags=[1] reports the test for lag 1 only
acorr_ljungbox(df, lags=[1], return_df=True)
```

```
       lb_stat     lb_pvalue
1   135.942829  2.053590e-31
```

Here the p-value is far smaller than 0.01, so we reject the null hypothesis. This indicates that the time series is autocorrelated at lag 1.
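To make the lb_stat and lb_pvalue columns less opaque, the statistic can be computed by hand. The sketch below is a minimal pure-Python version, not the statsmodels implementation. The helper names `ljung_box_q` and `chi2_sf_1df` and the toy series are hypothetical, and the p-value helper only covers the single-lag case.

```python
import math

def ljung_box_q(x, h):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n-k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    total = 0.0
    for k in range(1, h + 1):
        # sample autocorrelation at lag k
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        rho_k = ck / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

def chi2_sf_1df(q):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2))

# Hypothetical toy series: a steady upward trend, so neighbouring values are similar.
trend = [float(t) for t in range(50)]
q1 = ljung_box_q(trend, 1)
p1 = chi2_sf_1df(q1)
print(q1, p1)
```

On the same input, `ljung_box_q(x, 1)` should agree with the lb_stat that acorr_ljungbox reports for lag 1, up to floating-point error, since both rest on the same formula.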

Next, we repeat the test with a lag of 12, because the monthly series appears to have yearly seasonality.

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# lags=[12] reports the test for lag 12 only
acorr_ljungbox(df, lags=[12], return_df=True)
```

```
        lb_stat      lb_pvalue
12   852.413094  9.438013e-175
```

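Again, the p-value is far smaller than 0.01, so we reject the null hypothesis. Note that the statistic at lag 12 is cumulative: it tests the autocorrelations at lags 1 through 12 jointly, so this result indicates autocorrelation somewhere within the first 12 lags.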