Clusterer Module (API Reference)
The scikitplot.cluster module includes plots built specifically for scikit-learn clusterer instances, e.g. KMeans. You can use your own clusterers, but these plots assume specific properties shared by scikit-learn estimators. The specific requirements are documented per function.
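As a rough illustration of what "specific properties" means for a custom clusterer, the sketch below checks an object against the requirements this page lists for plot_elbow_curve (fit, fit_predict and score methods, plus an n_clusters hyperparameter). The helper missing_requirements and the class NoFitPredict are hypothetical names made up for this example and are not part of scikitplot; the library may also rely on other scikit-learn conventions that are not listed here.

```python
# Names taken from the plot_elbow_curve requirements on this page.
REQUIRED_METHODS = ("fit", "fit_predict", "score")
REQUIRED_PARAMS = ("n_clusters",)


def missing_requirements(clf):
    """Return the required method/attribute names that clf does not provide."""
    missing = [m for m in REQUIRED_METHODS if not callable(getattr(clf, m, None))]
    missing += [p for p in REQUIRED_PARAMS if not hasattr(clf, p)]
    return missing


class NoFitPredict:
    """Hypothetical custom clusterer that forgot to implement fit_predict."""

    n_clusters = 3

    def fit(self, X):
        return self

    def score(self, X):
        return 0.0


print(missing_requirements(NoFitPredict()))
```

An empty list means the object exposes every name in the documented interface; it does not guarantee that the plot will work with it.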
scikitplot.cluster.plot_elbow_curve(clf, X, title=u'Elbow Plot', cluster_ranges=None, n_jobs=1, show_cluster_time=True, ax=None, figsize=None, title_fontsize=u'large', text_fontsize=u'medium')
Plots the elbow curve for different values of K in KMeans clustering.
Parameters:
- clf – Clusterer instance that implements fit, fit_predict, and score methods, and an n_clusters hyperparameter, e.g. a sklearn.cluster.KMeans instance.
- X (array-like, shape (n_samples, n_features)) – Data to cluster, where n_samples is the number of samples and n_features is the number of features.
- title (string, optional) – Title of the generated plot. Defaults to “Elbow Plot”.
- cluster_ranges (None or list of int, optional) – List of n_clusters values for which to plot the explained variances. Defaults to range(1, 12, 2).
- n_jobs (int, optional) – Number of jobs to run in parallel. Defaults to 1.
- show_cluster_time (bool, optional) – Include a plot of the time it took to cluster for each K. Defaults to True.
- ax (matplotlib.axes.Axes, optional) – The axes upon which to plot the curve. If None, the plot is drawn on a new set of axes.
- figsize (2-tuple, optional) – Tuple denoting the figure size of the plot, e.g. (6, 6). Defaults to None.
- title_fontsize (string or int, optional) – Matplotlib-style font size for the title. Use e.g. “small”, “medium”, “large” or an integer value. Defaults to “large”.
- text_fontsize (string or int, optional) – Matplotlib-style font size for the remaining text. Use e.g. “small”, “medium”, “large” or an integer value. Defaults to “medium”.
Returns: The axes on which the plot was drawn.
Return type: ax (matplotlib.axes.Axes)
Example
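>>> import matplotlib.pyplot as plt
>>> import scikitplot as skplt
>>> from sklearn.cluster import KMeans
>>> # X is assumed to be an array-like of shape (n_samples, n_features)
>>> kmeans = KMeans(random_state=1)
>>> skplt.cluster.plot_elbow_curve(kmeans, X, cluster_ranges=range(1, 30))
<matplotlib.axes._subplots.AxesSubplot object at 0x7fe967d64490>
>>> plt.show()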