Artificial Intelligence & Machine Learning
k-Means
Definition
k-Means is a popular unsupervised learning algorithm used for clustering. It aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centroid).
Why It Matters
k-Means is a simple and fast algorithm for finding groups in unlabeled data. It is widely used for tasks like customer segmentation and image compression.
Contextual Example
Given a dataset of customer locations, k-Means can be used to find the optimal locations for 5 new stores (where k=5). The algorithm would group the customers into 5 clusters, and the center of each cluster would be the ideal store location.
Common Misunderstandings
- The "k" (the number of clusters) must be specified beforehand.
- The algorithm is sensitive to the initial placement of the centroids, so it is often run multiple times with different starting points.