How to Fit a Regularization Regression Model

02.17.2021

Intro

Sometimes data has highly correlated variables or a lot of variance. These are serious issues for simple linear regression and can lead to poor performance. To address them, we can use regularization models, which add a shrinkage penalty to the linear regression model. In this article, we will learn how to fit regularization models with Sklearn.

Creating a Regularization Model

To build a regularization model, we can use the Ridge model (there are others, such as Lasso and Elastic Net). Ridge is a common choice for handling correlated variables and variability. To use it, we create an instance and pass our data to the fit method, just as we do with other models. Note that we also standardize the features first, which helps with many models and is essentially required for Ridge.

We also specify an alpha value, which tells Sklearn "how much" penalty to apply. In practice, we try multiple alpha values and select the model with the best performance.

from sklearn.linear_model import Ridge
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older version
from sklearn.preprocessing import StandardScaler

# Load the Boston housing data
boston = load_boston()
features = boston.data
target = boston.target

# Standardize the features (zero mean, unit variance)
scaler = StandardScaler()
scaledFeats = scaler.fit_transform(features)

# Build the model with a shrinkage penalty of 0.5
regression = Ridge(alpha=0.5)

# Fit the model and print its R^2 score on the training data
model = regression.fit(scaledFeats, target)
print(model.score(scaledFeats, target))
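
As mentioned above, in practice we try several alpha values and keep the one that performs best. One way to do this is with Sklearn's RidgeCV, which cross-validates over a list of candidate alphas for us. Below is a minimal sketch using the same scaled Boston data; the particular alpha values are arbitrary choices for illustration.

from sklearn.linear_model import RidgeCV

# Candidate penalty strengths -- these specific values are just examples
alphas = [0.01, 0.1, 0.5, 1.0, 10.0]

# RidgeCV cross-validates over the candidate alphas and keeps the best one
cv_regression = RidgeCV(alphas=alphas)
cv_model = cv_regression.fit(scaledFeats, target)

# The selected penalty and the resulting R^2 score
print(cv_model.alpha_)
print(cv_model.score(scaledFeats, target))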