Supervised Learning

Machine Learning Basics

Before moving on to the examination, you should know about the following basics of ML. Some suitable material is suggested below (but you will also easily find other good material via Google/YouTube/...).

  • What are supervised, unsupervised and reinforcement learning?
  • The bias-variance trade-off.
  • Overfitting and what influences it; how regularization can help.
  • Why data is split into three sets: training (used to build the model), validation (used to check the model after training), and test (used only sparingly, so that it does not influence the training too much); see the sketch after this list
  • Validation by leave-one-out and K-fold cross-validation
  • One-hot encoding
  • Cost functions: mean squared error, cross-entropy
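
To connect these concepts to the tools used in the course, here is a minimal scikit-learn sketch of data splitting, cross-validation, one-hot encoding, and the two cost functions. The iris dataset and the logistic regression classifier are arbitrary choices made only for illustration.

```python
# Minimal sketch of the basics above, using scikit-learn.
# The dataset (iris) and the classifier (logistic regression) are arbitrary
# choices for illustration only.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_squared_error, log_loss

X, y = load_iris(return_X_y=True)

# Split into training, validation and test sets (here roughly 60/20/20).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Build the model from the training set, check it on the validation set.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))

# Cross-entropy (log loss) of the predicted class probabilities.
print("validation cross-entropy:",
      log_loss(y_val, model.predict_proba(X_val), labels=model.classes_))

# K-fold cross-validation (K=5) as an alternative to a fixed validation set.
print("5-fold CV accuracies:",
      cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=5))

# One-hot encoding of the class labels (needed by e.g. neural networks).
y_onehot = OneHotEncoder().fit_transform(y.reshape(-1, 1)).toarray()
print("one-hot shape:", y_onehot.shape)

# Mean squared error is the standard cost function for regression.
print("MSE:", mean_squared_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```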

Methods

Also make sure you have some knowledge of the following supervised learning methods. You will not have time to study them all in detail, but you should understand the basic assumptions and ideas behind each (a short scikit-learn sketch follows the list):

  • Linear regression
  • Logistic regression
  • K-nearest neighbors (KNN)
  • Linear and Quadratic Discriminant Analysis (LDA, QDA)
  • Decision trees and random forests
  • Support Vector Machines
  • Boosting
  • Neural networks (shallow; deep networks will be covered later)
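
One reason to use scikit-learn is that all of these methods are available through the same fit/predict/score interface, so it is easy to try several of them on the same data. The sketch below is illustrative only: the dataset, the specific estimator classes, and the hyperparameters are arbitrary choices, not recommendations, and plain linear regression is omitted because the toy problem here is classification.

```python
# Illustrative sketch: several of the methods listed above, all used through
# scikit-learn's common fit/predict/score interface. Dataset and
# hyperparameters are arbitrary choices for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "Logistic regression": LogisticRegression(max_iter=5000),
    "K-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "Decision tree": DecisionTreeClassifier(max_depth=5),
    "Random forest": RandomForestClassifier(n_estimators=100),
    "Boosting": GradientBoostingClassifier(),
    "SVM": SVC(kernel="rbf"),
    "Neural network (shallow)": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)  # train on the training set
    print(f"{name:26s} test accuracy: {clf.score(X_test, y_test):.3f}")
```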

 

Tools

We will use Python and the scikit-learn (sklearn) toolbox. We suggest you spend some time (1-2 h) browsing the nice collection of programming examples on the scikit-learn home page.

If you understand most of what happens in the following two Jupyter notebooks, you are ready to move on to the examination on music classification. Download these notebooks and data to your VM and run them. (Don't forget to activate the environment where the tools were installed, e.g. with the command source ~/tensorflowenv/bin/activate.)


Machine Learning Basics

Choose the style of material and ambition level that suits you.

A friendly introduction to Machine Learning [31 min]

Machine Learning Recipes (#1 in a playlist of 10)

 

Methods

Linear regression using scikit-learn [9min]

Logistic regression [11min] (Don't spend time downloading and running the linked code; you will have to work somewhat to get it going. A minimal scikit-learn sketch is given below.)
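
If you just want to see the essentials, the sketch below fits a logistic regression classifier in scikit-learn. The synthetic dataset and all parameter values are placeholders for illustration, not part of the linked tutorial.

```python
# Minimal logistic-regression sketch in scikit-learn (toy data for illustration).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# A synthetic binary classification problem.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("class probabilities for the first test point:", clf.predict_proba(X_test[:1]))
```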

For more detail, Chapter 4 in the book Introduction to Statistical Learning covers classification using logistic regression, linear discriminant analysis, quadratic discriminant analysis, and K-nearest neighbors (code is provided in the R language).

 

Decision Trees, Bagging and Random Forests

Sections 8.1-8.2 in the book Introduction to Statistical Learning cover Decision Trees, Bagging and Random Forests in more detail.

Boosting tutorial

Section 8.2.3 in the book Introduction to Statistical Learning describes Boosting somewhat further. Also see the Wikipedia page on Boosting.

 

Support Vector Machines [7min]

Chapter 9 in the book Introduction to Statistical Learning covers Support Vector Machines in more detail. Some more about the kernel trick can be found here.

 

What are Neural Networks, part 1

(you might want to watch parts 2-4 also)

 

To get some intuition for NN architectures, learning algorithms, and training parameters, spend some time (but not too much...) trying out the Neural Network Playground.

Additional material for the ambitious

These books contain detailed material on all topics above.

 

More links are given on the Further Resources page.