Nice, quick explanation of hard-to-grasp concepts:
“An interesting thing about gradients is that when you calculate them for a specific point, they give a vector that points in the direction of the biggest increase in the function, or equivalently, in the steepest uphill direction. The opposite direction of the gradient is the biggest decrease of the function, or the steepest downhill direction. This is why gradients are used in the optimization method “Gradient Descent”. The gradient (multiplied by a step size) is subtracted from a point to move it down hill.”
https://blog.demofox.org/2025/08/16/derivatives-gradients-jacobians-and-hessians-oh-my/
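
To make the quoted idea concrete, here is a minimal sketch of gradient descent in plain Python. The function `f`, its hand-derived gradient `grad_f`, the starting point, and the step size are all assumptions picked for illustration; none of them come from the linked post.

```python
def f(x, y):
    """Hypothetical example function: a bowl with its minimum at (0, 0)."""
    return x**2 + 3 * y**2


def grad_f(x, y):
    """Analytic gradient of f, i.e. (df/dx, df/dy)."""
    return (2 * x, 6 * y)


def gradient_descent(x, y, step_size=0.1, steps=100):
    """Repeatedly subtract the gradient (scaled by step_size) from the point."""
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x -= step_size * gx  # move against the gradient: downhill
        y -= step_size * gy
    return x, y


start = (3.0, -2.0)  # arbitrary starting point (assumption)
end = gradient_descent(*start)
print("f at start:", f(*start))
print("f at end:  ", f(*end))
```

With this particular function and step size, each step shrinks both coordinates toward the minimum at (0, 0). A step size that is too large for the function would overshoot instead, which is why it usually needs tuning.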