Artificial Intelligence & Machine Learning

Principal Component Analysis (PCA)

Definition

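Principal Component Analysis (PCA) is an unsupervised learning technique used for dimensionality reduction. It linearly transforms a large set of (often correlated) variables into a smaller set of uncorrelated variables, called principal components. The components are ordered by how much of the data's variance each one captures, so the first few retain most of the information in the original set.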
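One common way to compute this transformation is an eigendecomposition of the feature covariance matrix. The following is a minimal sketch of that approach, assuming NumPy is available; the function name `pca` and the toy data are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (n_features x n_features).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Keep the directions with the largest eigenvalues (largest variance).
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]      # (n_features, n_components)
    explained_variance = eigenvalues[order]
    return X_centered @ components, components, explained_variance

rng = np.random.default_rng(0)
# Toy data (assumed for illustration): 3 features driven mostly by one factor.
factor = rng.normal(size=(200, 1))
X = factor @ np.array([[2.0, 1.0, 0.5]]) + 0.1 * rng.normal(size=(200, 3))

scores, components, explained_variance = pca(X, n_components=2)
print(scores.shape)
print(explained_variance)
```

The returned `scores` are the new, smaller set of variables. Their variances equal the selected eigenvalues, and they are uncorrelated with one another.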

Why It Matters

PCA is used to simplify complex datasets by reducing the number of features (dimensions) while preserving as much variance as possible. This can help with data visualization, noise reduction, and improving the performance of other ML algorithms.

Contextual Example

Suppose a dataset has 100 features. PCA can find the two principal components that capture the most variance in the data. Plotting each sample's coordinates along these two components as a 2D scatter plot lets you visualize the structure of the high-dimensional data, although two components may retain only part of the total variance.
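The scenario above can be sketched as follows, assuming NumPy and a synthetic dataset in which two hidden factors drive all 100 features (the data, the variable names, and the noise level are assumptions made for illustration). PCA is computed here via a singular value decomposition of the centered data, which is one standard route.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_features = 500, 100

# Hypothetical data: two hidden factors drive all 100 features, plus noise.
latent = rng.normal(size=(n_samples, 2))
mixing = rng.normal(size=(2, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# PCA via SVD of the centered data; rows of Vt are the principal directions.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Coordinates of every sample along the first two components.
coords_2d = X_centered @ Vt[:2].T
# Share of the total variance captured by each of the two components.
explained_ratio = S[:2] ** 2 / np.sum(S ** 2)

print(coords_2d.shape)
print(explained_ratio.sum())
```

The two columns of `coords_2d` can be passed to any plotting library as the x and y values of a scatter plot. Checking `explained_ratio` first indicates how faithful the 2D picture is likely to be; on real data the share is often much lower than in this constructed example.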

Common Misunderstandings

  • PCA is not a feature selection method; it creates new features (the principal components) that are combinations of the old ones.
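  • Although it is a very common technique for dimensionality reduction, PCA is not a universal one: it captures only linear structure and is sensitive to feature scale, so features are usually standardized before it is applied.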
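To illustrate the first point in the list above, the sketch below prints the weights that the first principal component assigns to each original feature. It assumes NumPy; the feature names and the random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
feature_names = ["height", "weight", "age", "income"]  # hypothetical names
X = rng.normal(size=(300, 4))
X[:, 1] += 0.8 * X[:, 0]  # make two of the features correlated

X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

# The first principal direction holds one weight per original feature.
first_component = Vt[0]
for name, weight in zip(feature_names, first_component):
    print(f"{name}: {weight:+.3f}")

# The new feature is a weighted sum of every original column,
# not a copy of any single one of them.
pc1 = X_centered @ first_component
```

A feature selection method would instead keep a subset of the original columns unchanged. Here every original feature contributes to `pc1`, which is why principal components can be harder to interpret than the raw features.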

Related Terms

Last Updated: December 17, 2025