Course syllabus

https://www.kth.se/student/kurser/kurs/SF2930?l=en

Course content and objectives

This course offers an introduction to the mathematical theory of regression as a part of statistical inference and machine learning. Applications of regression to real-life data will also be treated, mainly in the Exercises and Projects. The course begins with simple and multiple linear regression, as these models are easy to work with yet well known to be useful in many applications. For these models, fitting, parametric and model inference, prediction, hypothesis testing, and model choice will be covered in mathematical detail in the lectures.

Special attention will be paid to diagnostic strategies, which are key components of good model fitting. Further topics include transformations and weightings to correct model inadequacies, the multicollinearity issue and shrinkage regression methods, and variable selection. Later in the course, some general theory for regression modeling will be presented, with a particular focus on generalized linear models (GLM), using examples with binary and count response variables. Bayesian regression is also covered.

The twenty-first century has seen an efflorescence of computer-based regression techniques, which are integrated into the course using the statistical software package R.

The overall goal of the course is twofold: to acquaint students with the statistical methodology of regression modeling and to develop the advanced practical skills necessary for applying regression analysis to real-world data analytics problems. The course is taught and examined in English.

Recommended prerequisites

  • SF1901 or equivalent course of the type 'a first course in probability and statistics'.
  • Multivariate normal distribution (to the extent of SF2940, probability theory).
  • Basic differential and integral calculus, basic linear algebra.

Examination

  • Computer projects (3.0 cr): there are two compulsory computer projects, each to be submitted as a written report. Each report should be written by a group of two (2) students, although individual reports are also accepted. The computer projects are graded Pass/Fail.
  • The written exam (4.5 cr): the exam will be held on Tuesday 11th March 2025, 08.00-13.00. It consists of 6 problems, each worth 6 points. The preliminary score needed to pass the exam is 18.
  • Final grades for the whole course are set according to the performance on the written exam. Grades are given in the range A-F, where A is the best and F means failed. Fx means that you have the right to a complementary examination (to reach the grade E). The criteria for Fx are a grade F on the exam and an identifiable isolated part of the course where you have shown a particular lack of knowledge, such that the grade E can be given after a complementary examination on this part.

Grade limits

  • A:    34-36
  • B:    30-33
  • C:    26-29
  • D:    22-25
  • E:    18-21
  • Fx:  15-17 
  • F:    < 15 

Written Exam Guidelines

Approximately two weeks before the exam, a list of around 36 questions (the "Exam Generator") will be provided to guide your preparation. The written exam will consist of six (6) of these questions (or slightly modified versions), with each question worth six points.

You can prepare answers and solutions by thoroughly studying the relevant chapters in the course textbooks. Be aware that derivations and concepts covered on the board during lectures are also part of the exam material. Additionally, proficiency in fundamental calculus, probability, linear algebra, and matrix calculus is essential.

The same set of preparatory questions (with potential minor adjustments such as removals or additions) will apply to the re-exam. Consequently, we will not be distributing a solutions manual.

Please note that no additional formula sheets, notes, or textbooks will be allowed at the exam. However, some relevant formulas required for solving exam questions will be provided on the exam sheet.