Semi-Supervised Learning
Definition
Semi-supervised learning is a machine learning approach that combines a small amount of labeled data with a large amount of unlabeled data during training. It falls between unsupervised learning (with no labeled data) and supervised learning (with only labeled data).
Why It Matters
Labeling data can be very expensive and time-consuming. Semi-supervised learning provides a way to leverage large amounts of easily available unlabeled data to improve the performance of a model, even when you only have a few labels.
Contextual Example
To train an image classifier, you might have a few thousand labeled images and millions of unlabeled images. A semi-supervised approach could first learn about the general structure of the images from the unlabeled data, and then use the labeled data to learn the specific classes.
Common Misunderstandings
- It is a useful technique when data labeling is a bottleneck.
- It tries to get the best of both supervised and unsupervised learning.