Mini-batch gradient descent (part 4)

Prof. Ng uses, e.g., $dW$ and $db$ as shorthand notation for the partial derivatives of the loss with respect to these variables, that is, for $\frac{\partial L}{\partial W}$ and $\frac{\partial L}{\partial b}$.

 

We are essentially always trying to figure out how much the loss changes when we vary some parameter, which is why this notation is practical. Still, in class we will probably be more explicit and write out the full expression.
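To make the shorthand concrete, here is a minimal sketch in Python with NumPy. It assumes a linear model with a mean-squared-error loss on a single mini-batch; the model, the loss, the data, and the function names (`loss`, `gradients`, `numeric_db`) are illustrative assumptions, not taken from the lecture. The variables `dW` and `db` hold $\frac{\partial L}{\partial W}$ and $\frac{\partial L}{\partial b}$, and the finite-difference check shows that `db` really measures how much the loss changes when `b` is nudged.

```python
import numpy as np

def loss(W, b, X, y):
    """Mean squared error (with a 1/2 factor) of a linear model on one mini-batch."""
    residual = X @ W + b - y
    return 0.5 * np.mean(residual ** 2)

def gradients(W, b, X, y):
    """Analytic gradients; dW and db are shorthand for dL/dW and dL/db."""
    residual = X @ W + b - y
    dW = X.T @ residual / len(y)   # same shape as W
    db = np.mean(residual)         # scalar, like b
    return dW, db

def numeric_db(W, b, X, y, eps=1e-6):
    """Central finite difference: change in L per unit change in b."""
    return (loss(W, b + eps, X, y) - loss(W, b - eps, X, y)) / (2 * eps)

# Illustrative random mini-batch: 8 examples, 3 features (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
W = rng.normal(size=3)
b = 0.5

dW, db = gradients(W, b, X, y)
print("analytic db:", db)
print("numeric  db:", numeric_db(W, b, X, y))
```

The two printed values should agree to several decimal places. Note that `dW` has the same shape as `W`, which is part of what makes the shorthand convenient when writing update rules.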