Artificial Intelligence & Machine Learning
Data Augmentation
Definition
Data augmentation is a technique for increasing the size and diversity of a training dataset by adding modified copies of existing examples or newly generated synthetic examples. It helps reduce overfitting when training a machine learning model.
Why It Matters
Deep learning models typically need large amounts of training data to perform well. When data is limited, augmentation artificially expands the dataset, exposing the model to more variation than the original examples contain. This helps the model generalize better to unseen data and makes it more robust.
Contextual Example
To train an image classification model, you could augment your dataset by creating modified versions of your existing images: rotating them slightly, zooming in or out, changing the brightness, or flipping them horizontally.
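The transformations above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: it assumes a grayscale image stored as a list of rows of pixel values in 0..255, and the function names (flip_horizontal, adjust_brightness, zoom_in, augment) are hypothetical helpers written for this example. Small-angle rotation is left out because it needs pixel interpolation, which an image library would normally handle.

```python
import random

# Assumption for this sketch: an image is a list of rows,
# each row a list of grayscale pixel values in 0..255.
Image = list[list[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale every pixel by `factor`, clamping the result to 0..255."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]


def zoom_in(img: Image, margin: int) -> Image:
    """Crude zoom: crop `margin` pixels from every edge (no resize back)."""
    height, width = len(img), len(img[0])
    if margin < 0 or 2 * margin >= min(height, width):
        raise ValueError("margin is negative or too large for this image")
    return [row[margin:width - margin] for row in img[margin:height - margin]]


def augment(img: Image, rng: random.Random) -> Image:
    """Return one randomly modified copy; the original is left untouched."""
    out = img
    if rng.random() < 0.5:                 # flip about half of the time
        out = flip_horizontal(out)
    # Illustrative brightness range; tune it for the dataset at hand.
    return adjust_brightness(out, rng.uniform(0.8, 1.2))


# Usage: expand a tiny labeled dataset with three augmented copies per image.
rng = random.Random(0)
dataset = [([[10, 20], [30, 40]], "cat")]
augmented = [(augment(img, rng), label) for img, label in dataset for _ in range(3)]
```

Each augmented copy reuses the label of its source image, which only works when the transformation does not change what the image shows. A horizontal flip is usually safe for photos of animals, for example, but not for images of text.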
Common Misunderstandings
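- Data augmentation is not only a way to get more data. It is also a form of regularization: because the model sees varied versions of each example, it is less able to memorize the originals.
- It is not a niche trick. It is a very common and effective technique, especially in computer vision.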