Once you have built a model, and provided the model is easy to interpret, it is often useful to learn which features are most important. This helps build intuition about which values affect the target or the prediction. For example, if you are looking at churn data, it would be useful to see which features of your churned customers (low usage, number of complaints) stand out, as a hint at the root cause. In this article, we will learn how to find the most important features in a Random Forest model.
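To view the most important features in a model, we use the feature_importances_ property. This returns an array of importance scores, one per feature, in the same order as the columns of the training data; it does not include the feature names. What a score means depends on the model. For a random forest in scikit-learn, it is an impurity-based importance, normalized so that the scores sum to 1. In general, the higher the value, the more important the feature is.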
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn import datasets
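# Load the iris data set and split it into features and target
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Train a random forest classifier with default settings
randomforest = RandomForestClassifier()
model = randomforest.fit(features, target)

# One importance score per feature, in the same order as the feature columns
importances = model.feature_importances_
print(importances)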
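The printed array holds only scores, so it can be hard to tell which number belongs to which feature. As a minimal sketch, the scores can be paired with the feature names from iris.feature_names and sorted from most to least important. The random_state=0 argument and the names order and ranked are assumptions added here for illustration, so that repeated runs give the same numbers; they are not part of the code above.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

iris = datasets.load_iris()

# random_state is fixed only to make the scores reproducible (an assumption)
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)
importances = model.feature_importances_

# Indices of the features, sorted from highest to lowest importance
order = np.argsort(importances)[::-1]

# Pair each feature name with its score, most important first
ranked = [(iris.feature_names[i], float(importances[i])) for i in order]

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because a random forest is built with randomness, the exact scores, and sometimes the order of closely ranked features, can change between runs unless the seed is fixed.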