One of the main issues when fitting a machine learning model is overfitting: the model learns parameters that fit the training data too closely and fail to generalize, often because it picks up noise (variance) in the data. To counter this, we can use regularization techniques (which also help with other issues). Let's see how to regularize a Logistic Regression model using sklearn.
To add regularization to Logistic Regression, we can use the LogisticRegressionCV class. We pass in two parameters: penalty and Cs. The penalty parameter specifies which type of regularization to use, from 'l1', 'l2' or 'elasticnet', which correspond to Lasso-, Ridge- and Elastic-Net-style penalties. The Cs parameter, when given an integer like 10, generates a list of 10 candidate values for C, the inverse of the regularization strength (so smaller C means a stronger penalty). The class cross-validates over these candidates and returns the best model from that selection of parameters.
```python
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler
from sklearn import datasets

iris = datasets.load_iris()
features = iris.data
target = iris.target

# Standardize the features so the penalty treats them on the same scale
features_standardized = StandardScaler().fit_transform(features)

# Create a Logistic Regression model with an l2 penalty,
# cross-validated over 10 candidate values of C
logistic_regression = LogisticRegressionCV(penalty='l2', Cs=10)

# Train model
model = logistic_regression.fit(features_standardized, target)
```
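The example above uses the l2 penalty, which works with the default solver. The other penalty types mentioned earlier need a compatible solver: as a rough sketch (details may vary by sklearn version), 'l1' requires 'liblinear' or 'saga', and 'elasticnet' requires 'saga' plus an l1_ratios list that sets the mix between the l1 and l2 terms. A hedged sketch of the elasticnet case:

```python
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler
from sklearn import datasets

iris = datasets.load_iris()
features_standardized = StandardScaler().fit_transform(iris.data)
target = iris.target

# elasticnet mixes l1 and l2; only the 'saga' solver supports it,
# and l1_ratios must be supplied (0.0 = pure l2, 1.0 = pure l1).
# max_iter is raised because saga can be slow to converge.
enet = LogisticRegressionCV(
    penalty='elasticnet',
    solver='saga',
    Cs=10,
    l1_ratios=[0.0, 0.5, 1.0],
    max_iter=5000,
)
enet.fit(features_standardized, target)

print(enet.C_)         # best C found per class
print(enet.l1_ratio_)  # best l1/l2 mix found per class
```

The cross-validation here searches over both the candidate C values and the candidate l1 ratios, so it is more expensive than the plain l2 fit, but it lets the data decide how sparse the coefficients should be.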
If you want more theory, we will cover the details in separate articles, or you can look up the Lasso, Ridge and Elastic Net models to get started.