How to Standardize Data with Sklearn

02.07.2021

Intro

One common preprocessing step is to standardize or normalize your data set. This puts values on a common scale when we compare continuous data from different ranges. For example, say we wanted to compare house prices quoted in the currencies of two different countries. It would be helpful to standardize the values to relative points, so that a given house costs, say, 500 points no matter which currency the price was listed in. In this article we will learn how to standardize features with sklearn.
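To make the currency idea concrete, here is a minimal sketch of standardizing by hand with NumPy. The prices, the exchange rate, and the `standardize` helper are all made up for illustration; the only point is that the same houses end up with the same standardized values in either currency.

```python
import numpy as np

# Hypothetical prices for the same five houses, in two currencies
prices_usd = np.array([200_000.0, 250_000.0, 300_000.0, 350_000.0, 400_000.0])
prices_jpy = prices_usd * 110.0  # assumed exchange rate, for illustration only


def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    return (values - values.mean()) / values.std()


z_usd = standardize(prices_usd)
z_jpy = standardize(prices_jpy)

# The currency unit drops out: both arrays hold the same relative values
print(z_usd)
print(np.allclose(z_usd, z_jpy))
```

Each standardized value says how many standard deviations a price sits above or below the average, which is why the two currencies become directly comparable.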

Standardizing Variables

To standardize variables we can use the scale function from the preprocessing module. We pass in our unscaled data and it returns the processed data, with each feature centered to zero mean and scaled to unit variance.

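from sklearn import preprocessing
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2.
# On newer versions, substitute another loader such as fetch_california_housing.
from sklearn.datasets import load_boston

# Load the feature matrix X and the target vector y
boston = load_boston()
X, y = boston.data, boston.target

# Standardize each feature (column) to zero mean and unit variance
X_scaled = preprocessing.scale(X)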
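
To see what scale actually does, it can help to run it on a tiny array and inspect the result. The sketch below uses a made-up matrix (`X_demo` is not part of the dataset above) and also shows `StandardScaler`, a class in the same module that appears to be the better fit when the same scaling has to be reapplied to new data later.

```python
import numpy as np
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix: two columns on very different ranges
X_demo = np.array([[1.0, 1000.0],
                   [2.0, 2000.0],
                   [3.0, 3000.0],
                   [4.0, 4000.0]])

X_demo_scaled = preprocessing.scale(X_demo)

# After scaling, each column should have mean ~0 and standard deviation ~1
col_means = X_demo_scaled.mean(axis=0)
col_stds = X_demo_scaled.std(axis=0)
print(col_means, col_stds)

# StandardScaler computes the same transformation, but stores the column
# means and standard deviations so they can be reused on unseen rows
scaler = StandardScaler().fit(X_demo)
X_demo_scaled_2 = scaler.transform(X_demo)
print(np.allclose(X_demo_scaled, X_demo_scaled_2))
```

With a train/test split, one would typically fit the scaler on the training rows only and call `transform` on the test rows, so that no information from the test set leaks into the scaling statistics.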