Once you have built a model, and that model is easy to interpret, it is often useful to learn which features matter most. This helps build intuition about which values drive the target or the prediction. For example, if you are looking at churn data, it helps to see which features of your churned customers (low usage, number of complaints) stand out, as a starting point for finding the root cause. In this article, we will learn how to find the most important features in a Random Forest model.
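To view the most important features in a model, we use the feature_importances_ property of the fitted model. It returns an array of importance scores, one per feature, in the same order as the columns of the training data. What the score measures depends on the model. For a scikit-learn random forest it is the impurity-based importance: how much splits on that feature reduce impurity, averaged over the trees and normalized so that the scores sum to 1. In general, the higher the value, the more important the feature is.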

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestClassifier
    from sklearn import datasets

    # Load the iris data set: four numeric features, three classes
    iris = datasets.load_iris()
    features = iris.data
    target = iris.target

    # Fit a random forest classifier with default settings
    randomforest = RandomForestClassifier()
    model = randomforest.fit(features, target)

    # One importance score per feature, in column order
    importances = model.feature_importances_
    print(importances)
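The printed array holds only numbers, so it is not obvious which score belongs to which feature. As a minimal sketch, the scores can be paired with the names in iris.feature_names and sorted from most to least important. The names order and ranked, and the fixed random_state, are assumptions added for this illustration. They are not part of the example above, and the exact scores will vary with the seed.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

iris = datasets.load_iris()

# random_state is fixed only so that repeated runs give the same scores
# (an assumption for this sketch; the example above uses the defaults)
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)

importances = model.feature_importances_

# Indices of the features, ordered from most to least important
order = np.argsort(importances)[::-1]

# Pair each feature name with its score, in ranked order
ranked = [(iris.feature_names[i], float(importances[i])) for i in order]

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because the scores are normalized, each one can be read as that feature's share of the total importance within this particular model.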