Easy k-Means Clustering with MATLAB
    Video length is 1:50

    Cluster data using the k-means algorithm in the Live Editor. The Cluster Data Live Editor Task enables you to interactively perform k-means clustering. The task generates MATLAB® code for your live script and returns the resulting cluster indices and the cluster centroid locations to the MATLAB workspace. Set the number of clusters manually, or have the task determine the optimal number automatically by specifying a criterion such as gap values, silhouette values, Davies-Bouldin index values, or Calinski-Harabasz index values. Customize the parameters for clustering your data, including the distance metric and the number of replicates. Automatically visualize the clustered data.

    Published: 21 Feb 2022

    Clustering is important for exploring your data and can assist with refining your features. You can now apply k-means, the most popular clustering method, interactively in MATLAB with the new Cluster Data Live Editor task. You can choose the number of clusters manually, or have it selected automatically by specifying the criteria, and the task automatically visualizes the clustered data.

    In this example, we will use the Human Activity data set, which is based on smartphone sensor signals representing sitting, standing, walking, running, and dancing. As you can see, many of the running and dancing data points overlap. So instead, we will try to differentiate running and dancing from walking, sitting, and standing. We will now use the Cluster Data Live Editor task to interactively perform k-means clustering on the Human Activity data set.

    Let's start a new session. You can open the task by typing Cluster Data or by going to Insert, Task, Stats and Machine Learning, Cluster Data. We're going to cluster our data set with the manual method because we want to separate it into two groups. We select the data set features as Input Data and set the number of clusters to two. Click Run Section, and MATLAB displays the clustered data in two groups, along with the cluster means, in the scatter plot.
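
For readers who prefer to work at the command line, the manual two-cluster step can be approximated with a direct call to `kmeans`. This is a minimal sketch, not the code the task itself generates: it assumes Statistics and Machine Learning Toolbox is installed, and the `features` matrix here is hypothetical synthetic data standing in for the Human Activity features.

```matlab
% Requires Statistics and Machine Learning Toolbox (kmeans, gscatter).
rng(0);                                   % fix the seed so the run is repeatable

% Hypothetical stand-in for the feature matrix: two well-separated 2-D blobs.
features = [randn(100, 2); randn(100, 2) + 6];

numClusters = 2;                          % manual choice, as in the video
[clusterIndices, centroids] = kmeans(features, numClusters);

% Plot the points colored by cluster, with the centroids overlaid.
gscatter(features(:, 1), features(:, 2), clusterIndices);
hold on
plot(centroids(:, 1), centroids(:, 2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
hold off
```

`clusterIndices` holds one cluster number per row of `features`, and `centroids` holds one row per cluster, which mirrors the two outputs the task returns to the workspace.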

    There are options for selecting the optimal number of clusters, as well as for how the data is clustered. We can see the criteria by which the optimal number of clusters is chosen. Changing the distance parameter from squared Euclidean to city block changes our clusters slightly. We can compare how the clustering worked against our labeled data. To learn more about clustering data and download example data sets, click on the Help icon in the top right corner of the app.
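
The same options can be explored in code. The sketch below assumes Statistics and Machine Learning Toolbox is installed and uses hypothetical synthetic data (`features`, `trueLabels`) rather than the Human Activity data set. It picks the number of clusters with the silhouette criterion via `evalclusters`, reclusters with the city block distance and several replicates, and cross-tabulates the result against the known labels.

```matlab
% Requires Statistics and Machine Learning Toolbox.
rng(0);                                   % fix the seed so the run is repeatable

% Hypothetical stand-in data: two blobs with known labels.
features   = [randn(100, 2); randn(100, 2) + 6];
trueLabels = [ones(100, 1); 2 * ones(100, 1)];

% Let a criterion pick the number of clusters from a candidate list.
% Other supported criteria: 'gap', 'DaviesBouldin', 'CalinskiHarabasz'.
evaluation = evalclusters(features, 'kmeans', 'silhouette', 'KList', 2:5);
optimalK   = evaluation.OptimalK;

% Recluster with the city block distance and several replicates.
clusterIndices = kmeans(features, optimalK, ...
    'Distance', 'cityblock', 'Replicates', 5);

% Cross-tabulate known labels (rows) against cluster assignments (columns).
agreement = crosstab(trueLabels, clusterIndices);
disp(agreement);
```

Cluster numbers are arbitrary, so a good result shows each row of `agreement` concentrated in a single column rather than necessarily on the diagonal.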
