Artificial Intelligence & Machine Learning
Learning Rate
Definition
The learning rate is a hyperparameter in an optimization algorithm like gradient descent that determines the step size at each iteration while moving toward a minimum of a loss function.
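Concretely, in plain gradient descent the learning rate (often written as eta or lr) scales the gradient in each parameter update: theta <- theta - lr * dL/dtheta. A minimal sketch in Python, assuming a simple one-dimensional loss L(x) = x**2 for illustration:

```python
def gradient_descent_step(params, gradients, learning_rate):
    """One plain gradient descent update: theta <- theta - lr * dL/dtheta."""
    return [p - learning_rate * g for p, g in zip(params, gradients)]

# Example: minimize L(x) = x**2, whose gradient is 2*x.
x = [5.0]
for _ in range(100):
    grad = [2 * x[0]]
    x = gradient_descent_step(x, grad, learning_rate=0.1)
print(x)  # very close to the minimum at 0
```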
Why It Matters
The learning rate is one of the most important hyperparameters to tune. If it is too small, training converges very slowly and may stall before reaching a good solution. If it is too large, each update may overshoot the minimum, causing the loss to oscillate or diverge rather than converge.
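To make both failure modes concrete, the toy loop below (a sketch, not tied to any framework; the specific values are illustrative) minimizes L(x) = x**2 with three learning rates. Because the gradient is 2*x, each step multiplies x by (1 - 2*lr), which only shrinks toward zero when the learning rate is small enough:

```python
def minimize(learning_rate, steps=50, x=5.0):
    """Run plain gradient descent on L(x) = x**2 (gradient is 2*x)."""
    for _ in range(steps):
        x = x - learning_rate * 2 * x  # each step scales x by (1 - 2*lr)
    return x

print(minimize(0.001))  # too small: still far from 0 after 50 steps (about 4.5)
print(minimize(0.1))    # reasonable: very close to 0
print(minimize(1.1))    # too large: overshoots every step and diverges
```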
Contextual Example
In the analogy of descending a mountain, the learning rate is the size of the step you take. A small learning rate is like taking tiny baby steps: you will reach the bottom eventually, but slowly. A large learning rate is like taking giant leaps: you cover ground quickly but risk jumping right past the valley floor.
Common Misunderstandings
- A common misunderstanding is that any reasonable learning rate will do; in practice, finding a good learning rate is a critical part of training deep learning models.
- Another is that the learning rate must stay fixed throughout training; techniques like "learning rate schedules" are often used to decrease it over time as training progresses (see the sketch after this list).
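One common schedule is exponential decay, which multiplies the learning rate by a fixed factor at a regular interval. A minimal sketch, where the starting rate, decay factor, and interval are illustrative values rather than recommendations:

```python
def exponential_decay(initial_lr, decay_rate, step, decay_steps):
    """Learning rate after `step` steps: initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)

# Illustrative values: start at 0.1 and halve the learning rate every 1000 steps.
for step in (0, 1000, 2000, 3000):
    print(step, exponential_decay(initial_lr=0.1, decay_rate=0.5,
                                  step=step, decay_steps=1000))
# prints roughly 0.1, 0.05, 0.025, 0.0125
```

Other schedules (step decay, cosine annealing, warmup) follow the same idea: large steps early for fast progress, smaller steps later for fine-grained convergence.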