
Explain like I'm five
Imagine you're blindfolded on a hill and want to find the lowest valley. You take small steps downhill, feeling the slope with your feet, and keep adjusting your direction until you can't go any lower. That's gradient descent—it's how AI 'walks' to find the best answer by taking tiny steps in the right direction.

Why it matters
Gradient Descent is the engine behind training most machine learning models, from linear regression to deep neural networks. Without it, AI couldn't learn from data to make accurate predictions or decisions.

Common misconception
Many think gradient descent finds the absolute best answer (global minimum) every time, but it often gets stuck in a shallow valley (local minimum) instead. Also, people confuse the 'gradient' with the 'step size'—the gradient tells direction, while the learning rate controls step size.

Formal definition
Gradient Descent is a first-order iterative optimization algorithm for minimizing a differentiable function. It updates parameters in the opposite direction of the gradient of the loss function with respect to the parameters, scaled by a learning rate, until convergence to a local or global minimum.