How to Use PCA in Sklearn

02.13.2021

Intro

PCA (principal component analysis) is a common preprocessing technique used with many machine learning algorithms. It replaces your predictors (x variables) with a smaller set of new variables, each of which is a linear combination of the originals. This makes the final model harder to interpret, but has the benefit of reducing the number of variables that need to be fit. In this article, we will see how to use PCA in Sklearn.
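
To make the idea of a "linear combination" concrete, here is a minimal NumPy-only sketch. The toy values are made up for illustration, and the SVD route shown is one common way to compute principal components, not necessarily what any particular library does internally.

```python
import numpy as np

# Toy data (hypothetical values): 5 samples, 2 strongly correlated predictors
X = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]])

# Center each column, then take the SVD; the rows of Vt hold the component weights
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Each new variable is a weighted sum (linear combination) of the original columns
scores = X_centered @ Vt.T

# Share of the total variance carried by each component
variance_share = S**2 / np.sum(S**2)

print(Vt[0])           # weights that define the first component
print(variance_share)  # the first entry dominates because the columns are correlated
```

Because the two columns move together, nearly all of the variance ends up in the first new variable, which is why the second one could be dropped with little loss.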

Using PCA

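To use PCA, we create a PCA instance using the class from the decomposition module. Then, we call the fit_transform method and pass in our X matrix. This returns a new matrix in which each column is a linear combination of our original variables. Note that with no arguments, PCA() keeps every component, so for the iris data the result still has four columns; pass n_components to keep fewer.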

from sklearn import datasets
from sklearn import decomposition

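# Load the iris data: X holds the four predictors, y the class labels
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Fit PCA on X and project X onto the principal components
pca = decomposition.PCA()
xPca = pca.fit_transform(X)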

print(xPca)
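
The example above keeps all four components, so nothing is actually reduced. As a minimal sketch of the reduction itself, the variant below passes n_components=2 and inspects the fitted explained_variance_ratio_ and components_ attributes. The choice of two components and the name X_reduced are illustrative assumptions, not a recommendation.

```python
from sklearn import datasets, decomposition

iris = datasets.load_iris()
X = iris.data  # 150 samples, 4 features

# Keep only the first two principal components (2 is an arbitrary choice here)
pca = decomposition.PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (150, 2)
print(pca.explained_variance_ratio_)  # fraction of total variance kept by each component
print(pca.components_.shape)          # (2, 4): one row of weights per component
```

Summing explained_variance_ratio_ gives a rough sense of how much information survives the reduction. PCA is sensitive to the scale of each feature, so when predictors use very different units it is common to standardize them first, for example with StandardScaler from sklearn.preprocessing.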