One common preprocessing step is to standardize or normalize your data set. This puts values on a common scale so that we can compare continuous data drawn from different ranges. For example, say we wanted to compare house prices quoted in the currencies of two different countries. It would be helpful to standardize the values to relative points, so that a house costs, say, 500 points no matter which currency its price was quoted in. In this article we will learn how to standardize features with sklearn.
To standardize variables we can use the scale function from the preprocessing module. We pass in our unscaled data and it returns the processed data: by default, each feature (column) is centered to zero mean and scaled to unit variance.
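from sklearn import preprocessing
# Note: load_boston was removed in scikit-learn 1.2; on newer versions, load another dataset here.
from sklearn.datasets import load_boston

boston = load_boston()
X, y = boston.data, boston.target  # feature matrix and target vector
# Center each feature to zero mean and scale it to unit variance
X_scaled = preprocessing.scale(X)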
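To see what scale does to the numbers, here is a minimal sketch on a small made-up array. The house prices and the fixed exchange rate of 110 are hypothetical values chosen for illustration, not real data. It checks that every scaled column has mean 0 and standard deviation 1, and that the result matches subtracting the column mean and dividing by the column standard deviation by hand.

```python
import numpy as np
from sklearn import preprocessing

# Hypothetical toy data: the same four houses priced in two currencies.
# Column 1 is simply column 0 times an assumed exchange rate of 110.
X = np.array([
    [250000.0, 27500000.0],
    [310000.0, 34100000.0],
    [500000.0, 55000000.0],
    [420000.0, 46200000.0],
])

# Standardize each column (feature) independently
X_scaled = preprocessing.scale(X)

# The same computation by hand: subtract the column mean and divide by
# the column standard deviation (population standard deviation, ddof=0)
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

means = X_scaled.mean(axis=0)   # approximately 0 for every column
stds = X_scaled.std(axis=0)     # approximately 1 for every column
matches_manual = np.allclose(X_scaled, X_manual)

print(means, stds, matches_manual)
```

Because the second column is only a rescaled copy of the first, both columns end up with identical standardized values, which is the "relative points" idea from the currency example: once standardized, the original units no longer matter.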