Mini-batch gradient descent (part 4)
Prof. Ng uses, e.g., dW and db as shorthand notation for the partial derivatives of the loss with respect to these variables, that is, for $\partial L / \partial W$ and $\partial L / \partial b$.
We are essentially always trying to figure out how much the loss changes when we vary some parameter, which is why this notation is practical. Still, during our classes we will probably be more explicit and write out the full expression.
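To make the shorthand concrete, here is a minimal sketch in Python with NumPy. It assumes a linear model with a mean-squared-error loss, which is not something the lecture prescribes; the function names `loss` and `gradients` and the randomly generated mini-batch are purely illustrative. The variables `dW` and `db` hold the analytic values of $\partial L / \partial W$ and $\partial L / \partial b$, and a finite-difference check measures directly how much the loss changes when each parameter is nudged.

```python
import numpy as np

def loss(W, b, X, y):
    """Assumed loss: half the mean squared error of the linear model X @ W + b."""
    residual = X @ W + b - y
    return 0.5 * np.mean(residual ** 2)

def gradients(W, b, X, y):
    """Return (dW, db), the partial derivatives of the loss w.r.t. W and b."""
    residual = X @ W + b - y
    dW = X.T @ residual / len(y)  # dL/dW, same shape as W
    db = np.mean(residual)        # dL/db, a scalar
    return dW, db

# Illustrative mini-batch: 8 examples with 3 features each (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
W = rng.normal(size=3)
b = 0.5

dW, db = gradients(W, b, X, y)

# Finite-difference check: vary one parameter slightly and watch the loss.
eps = 1e-6
db_numeric = (loss(W, b + eps, X, y) - loss(W, b - eps, X, y)) / (2 * eps)
dW_numeric = np.zeros_like(W)
for i in range(W.size):
    step = np.zeros_like(W)
    step[i] = eps
    dW_numeric[i] = (loss(W + step, b, X, y) - loss(W - step, b, X, y)) / (2 * eps)

print("dW (analytic):", dW)
print("dW (numeric): ", dW_numeric)
print("db (analytic):", db)
print("db (numeric): ", db_numeric)
```

The analytic and numeric values should agree to within floating-point error, which is exactly the sense in which dW and db describe how the loss responds to a small change in W or b.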