Bias/variance: remarks (part 1)

In these videos, the terms bias and variance are used in a relaxed sense:

  • bias $\approx$ performance on the training data compared to optimal performance,
  • variance $\approx$ difference between the loss on the training data and the loss on the validation data,

and for general problems, not only regression problems. The purpose of introducing these concepts is to help you reason about how to adjust your neural network architectures, as the sketch below illustrates.
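As a concrete illustration of this relaxed usage, here is a minimal sketch (not from the videos; the function name, the example numbers, and the assumption that an estimate of the optimal error is available are all hypothetical) of how these two gaps can serve as rule-of-thumb diagnostics:

```python
def diagnose(train_error, val_error, optimal_error=0.0):
    """Rough bias/variance indicators in the relaxed sense used here.

    bias     ~ gap between training error and (estimated) optimal error
    variance ~ gap between validation error and training error
    """
    bias = train_error - optimal_error
    variance = val_error - train_error
    return bias, variance

# Hypothetical numbers: 1% training error, 11% validation error,
# and an assumed ~0% optimal (Bayes) error.
bias, variance = diagnose(train_error=0.01, val_error=0.11, optimal_error=0.0)
print(f"bias ~ {bias:.2f}, variance ~ {variance:.2f}")
```

With these numbers, the small train-to-optimal gap but large train-to-validation gap would point to a variance problem (overfitting); the reverse pattern would point to a bias problem (underfitting).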


People with a background in statistics may recall that bias and variance have specific technical definitions. Those definitions are not used in these videos but are repeated below for completeness:

In regression, if $\mu_y = \mathbb{E} \left[ \hat{y}(X) \,\Big|\, Y=y \right]$, then

  • $\text{Bias}(y) = \mu_y - y$
  • $\text{Variance}(y) = \mathbb{E} \left[ ( \hat{y}(X) - \mu_y )^2 \,\Big|\, Y=y \right]$

Roughly speaking, bias represents consistent errors (for a given label), whereas variance represents errors due to variations in $\hat{y}(X)$, which is at least vaguely related to how the terms are used in the upcoming videos.
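For completeness, the two quantities combine in the standard decomposition of the conditional mean squared error (a well-known identity, stated here rather than taken from the videos):

\[
\mathbb{E} \left[ ( \hat{y}(X) - y )^2 \,\Big|\, Y=y \right]
= ( \mu_y - y )^2 + \mathbb{E} \left[ ( \hat{y}(X) - \mu_y )^2 \,\Big|\, Y=y \right]
= \text{Bias}(y)^2 + \text{Variance}(y).
\]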