How to Train a Logistic Regression Model on Large Data in Sklearn

03.11.2021

Intro

When you have large amounts of data, you will often need to select a different "solver" for your logistic regression model. The solver is the part of the underlying algorithm that uses mathematical optimization techniques to find the model's coefficients. Sklearn allows you to choose the solver when building your model. In this article, we will learn how to train a logistic regression model on large data with Sklearn.

Fitting Logistic Regression to Large Data

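To change the solver for your logistic regression model, you simply need to specify the solver parameter when creating an instance of LogisticRegression. If you specify the sag solver (stochastic average gradient), this will help you fit and classify on a large dataset. Because sag converges quickly only when the features are on roughly the same scale, we standardize the features before fitting.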

# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

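# Load the iris data (a small dataset, used here only for illustration)
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Standardize the features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

# Create a logistic regression object that uses the sag solver
logistic_regression = LogisticRegression(solver="sag")

# Train the model and print its accuracy on the training data
model = logistic_regression.fit(features_standardized, target)
print(model.score(features_standardized, target))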
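
The iris data has only 150 rows, so it does not really show the sag solver on large data. As a rough sketch of the same steps at a larger scale, the example below generates a synthetic dataset with make_classification and scores the model on a held-out test set. The dataset size, the max_iter value, and the random_state values are assumptions chosen for illustration and are not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (sizes are illustrative assumptions)
features, target = make_classification(
    n_samples=50_000, n_features=20, n_informative=10, random_state=0
)

# Hold out a test set so the score reflects data the model has not seen
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=0
)

# Standardize inside a pipeline so the scaler is fit on the training data only
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="sag", max_iter=200, random_state=0),
)
model.fit(X_train, y_train)

# Accuracy on the held-out test set
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Putting the scaler and the classifier in one pipeline keeps the standardization step tied to the training data, which matters once you evaluate on a separate test set. If sag reports that it did not converge on your own data, raising max_iter is one option to try.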