One disadvantage of k-means is that you need to specify the number of clusters, K, in advance. Mean shift is a clustering algorithm that avoids this problem. It works by repeatedly shifting each observation toward the densest nearby region of the data, so that observations converging on the same peak end up in the same cluster. There are more details below, but mean shift is a good alternative to standard k-means when the number of clusters is unknown. In this article, we will learn how to build a clustering model with MeanShift in Sklearn.
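As a rough illustration of the idea behind mean shift, the sketch below moves a single point toward a dense region by repeatedly replacing it with the mean of its neighbors. It is a simplified flat-kernel version written in plain NumPy, not Sklearn's implementation; the function name shift_point, the synthetic two-blob data, and the bandwidth value are all assumptions made for the example.

```python
import numpy as np


def shift_point(point, data, bandwidth, max_iter=100, tol=1e-6):
    """Move one point toward a density peak using a flat kernel (illustrative helper)."""
    current = np.asarray(point, dtype=float)
    for _ in range(max_iter):
        # Find every observation within `bandwidth` of the current position
        distances = np.linalg.norm(data - current, axis=1)
        neighbors = data[distances <= bandwidth]
        if len(neighbors) == 0:
            break
        # Shift to the mean of those neighbors
        new = neighbors.mean(axis=0)
        moved = np.linalg.norm(new - current)
        current = new
        if moved < tol:
            break
    return current


# Synthetic data: two well-separated blobs (assumed purely for illustration)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
data = np.vstack([blob_a, blob_b])

# Points starting in different blobs settle near different peaks
mode_a = shift_point(data[0], data, bandwidth=1.5)
mode_b = shift_point(data[-1], data, bandwidth=1.5)
print(mode_a, mode_b)
```

Running the same procedure from every observation and grouping the points that land on the same peak gives the clusters, which is why the number of clusters does not have to be chosen up front.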
To cluster with mean shift, we use the MeanShift class from the sklearn.cluster module. As with other models in Sklearn, we create an instance of MeanShift and then pass our data to its fit method.
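# Load libraries
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MeanShift
# Load the iris data
iris = datasets.load_iris()
features = iris.data
# Standardize the features, since mean shift is distance-based
scaler = StandardScaler()
features_std = scaler.fit_transform(features)
# Create a mean shift model with default settings
meanshift = MeanShift()
# Fit the model to the standardized data
model = meanshift.fit(features_std)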
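Once the model is fitted, it can help to check what it found. The sketch below reads the labels_ and cluster_centers_ attributes of the fitted model, then refits with a bandwidth computed by estimate_bandwidth. The quantile value of 0.2 is an arbitrary choice for illustration, not a recommendation, and the variable names are assumptions of this example.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MeanShift, estimate_bandwidth

# Same preparation as above: load iris and standardize the features
features = datasets.load_iris().data
features_std = StandardScaler().fit_transform(features)

# Fit with default settings; the bandwidth is estimated automatically
model = MeanShift().fit(features_std)

labels = model.labels_            # cluster index assigned to each observation
centers = model.cluster_centers_  # coordinates of each cluster center
n_clusters = len(np.unique(labels))
print("Clusters found with the default bandwidth:", n_clusters)

# Set the bandwidth explicitly (quantile=0.2 is an illustrative value)
bandwidth = estimate_bandwidth(features_std, quantile=0.2)
model_q20 = MeanShift(bandwidth=bandwidth).fit(features_std)
n_clusters_q20 = len(np.unique(model_q20.labels_))
print("Clusters found with quantile=0.2:", n_clusters_q20)
```

The bandwidth controls how far each observation looks for neighbors, so it is the main setting that influences how many clusters mean shift reports. Comparing the two counts printed here is a quick way to see how sensitive a dataset is to that choice.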