Gradient descent in plain terms
Gradient descent is a numerical method for finding a local minimum of a function by iteratively stepping against the gradient (stepping along the gradient instead finds a local maximum; that variant is called gradient ascent). It is one of the core numerical methods in modern optimization.
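Concretely, starting from an initial point $x_0$, the method repeats the update

$$x_{k+1} = x_k - \alpha \, \nabla f(x_k),$$

where $\alpha > 0$ is the step size, known in machine learning as the learning rate.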
When people in machine learning talk about “training a model”, they almost always mean the same thing: choosing parameters so that the error becomes as small as possible. Which particular model it is (linear regression, logistic regression, a neural network) matters less. What matters is that under the hood the same mechanism is almost always at work: gradient descent.
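To make the idea concrete before the detailed sections below, here is a minimal sketch of the update loop on a toy one-dimensional function. The function f(x) = (x - 3)^2, the starting point, and the learning rate are illustrative choices for this sketch, not the article's implementation.

```python
# Minimal sketch: gradient descent on f(x) = (x - 3)^2.
# Its gradient is f'(x) = 2 * (x - 3), and the minimum is at x = 3.

def grad(x):
    return 2 * (x - 3)

x = 0.0              # illustrative starting point
learning_rate = 0.1  # illustrative step size (alpha)

for step in range(50):
    x = x - learning_rate * grad(x)  # step against the gradient

print(x)  # approaches 3.0, the minimizer of f
```

The same loop, with the toy function replaced by a model's error and the scalar x replaced by its parameters, is what "training" means in practice.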
- Implementation of gradient descent
- Example 1. Parameter trajectory
- Example 2. Effect of learning rate
- Example 3. Plateau and near-zero gradient
- Example 4. Batch and stochastic descent