When loading data, you will often find that some values are missing. There are various ways to handle missing data. A common one is imputation, which fills in the missing entries with estimated values. In this article, we will see how to impute data with sklearn.
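To impute data, we use the preprocessing.Imputer() class. Once we have an instance of this class, we can call the fit_transform method on data with missing values, and sklearn will return the data with the gaps filled in. By default, each missing value is replaced with the mean of its column. Note that preprocessing.Imputer was removed in scikit-learn 0.22; newer versions provide sklearn.impute.SimpleImputer instead.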
import numpy as np
from sklearn import datasets
from sklearn import preprocessing

## Load the data
iris = datasets.load_iris()
X = iris.data

## Mark some as empty
X[1:25] = np.nan

## Impute the missing data
impute = preprocessing.Imputer()
xImputed = impute.fit_transform(X)
print(xImputed)
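On recent scikit-learn releases, where preprocessing.Imputer is no longer available, the same steps can be sketched with sklearn.impute.SimpleImputer. This is a minimal sketch rather than the article's original code: the variable names (imputer, x_imputed) and the explicit strategy="mean" argument are choices made here for illustration, and the data is copied first so the loaded array is not modified.

```python
import numpy as np
from sklearn import datasets
from sklearn.impute import SimpleImputer

# Load the data (copy so the array returned by the loader is left untouched)
X = datasets.load_iris().data.copy()

# Mark rows 1 through 24 as missing
X[1:25] = np.nan

# Replace each missing value with the mean of its column,
# computed from the rows that still have values
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
x_imputed = imputer.fit_transform(X)

print(x_imputed[:3])
```

Other strategies accepted by SimpleImputer include "median", "most_frequent", and "constant", which may suit skewed or categorical columns better than the mean.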