Course syllabus

Course content and objectives

This course offers an introduction to the mathematical theory of regression as a part of statistical inference and machine learning. Applications of regression to real-life data will also be treated, mainly in the exercises and projects. The course begins with linear (single and multiple) regression models, as they are simple yet well known to be useful in many applications. For these models, fitting, parametric and model inference, prediction, hypothesis testing and model choice will be covered in mathematical detail in the lectures. This material relies heavily on linear algebra.
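
As a minimal illustration of how model fitting rests on linear algebra, the sketch below solves the normal equations X^T X b = X^T y for a single-regressor model. It is written in plain Python for illustration only (the course itself uses R); the function name `fit_simple_regression` and the data are hypothetical.

```python
def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x via the 2x2 normal equations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    # Entries of X^T X and X^T y for the design matrix X = [1, x].
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))

    # Determinant of X^T X; it is zero when all x values coincide.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x must contain at least two distinct values")

    # Explicit inverse of the 2x2 matrix X^T X applied to X^T y.
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Made-up data lying exactly on the line y = 1 + 2x.
b0, b1 = fit_simple_regression([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With several regressors the same equations are solved in matrix form, which is where the linear algebra prerequisites come in.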

Special attention will be paid to diagnostic strategies, which are key components of good model fitting. Further topics include transformations and weightings to correct model inadequacies, the multicollinearity issue, shrinkage regression methods and variable selection. Later in the course, some general theories for regression modeling will be presented, with a particular focus on generalized linear models (GLM), using examples with binary and count response variables.

Since high-dimensional data, orders of magnitude larger than those that classical regression theory was designed for, are nowadays the rule rather than the exception in computer-age practice (examples include information technology, finance, genetics and astrophysics, to name just a few), regression methodologies that can deal with high-dimensional scenarios are presented. Here generalized inverses will be introduced.

The twenty-first century has seen an efflorescence of computer-based regression techniques, which are integrated into the course using the statistical software package R.

The overall goal of the course is twofold: to acquaint students with the statistical methodology of regression modeling and to develop the advanced practical skills that are necessary for applying regression analysis to real-world data analytics problems. The course is lectured and examined in English.

Recommended prerequisites

  • SF1901 or equivalent course of the type 'a first course in probability and statistics'.
  • Multivariate normal distribution (to the extent of SF2940, probability theory).
  • Basic differential and integral calculus, basic linear algebra.

Examination

  • Computer projects (3.0 cr): there are two compulsory computer projects, each to be submitted as a written report. Each report should be written by a group of two (2) students, although individual reports are also accepted. The computer projects are graded Pass/Fail.
  • The written exam (4.5 cr): the exam will be held on Monday 11th March 2024, 08.00-13.00. The exam consists of 6 problems, each worth 6 points (36 points in total). The preliminary score needed to pass the exam is 18.
  • Final grades for the whole course are set according to the quality of the written examination. Grades are given in the range A-F, where A is the best and F means failed. Fx means that you have the right to a complementary examination (to reach the grade E). The criteria for Fx are a grade F on the exam and an identifiable, isolated part of the course where you have shown a particular lack of knowledge, such that the grade E can be given after a complementary examination on that part.

    GRADE LIMITS (BETYGSGRÄNSERNA)

      A:   34-36
      B:   30-33
      C:   26-29
      D:   22-25
      E:   18-21
      Fx:  15-17
      F:   < 15
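
The grade limits above can be read as a simple lookup on the exam score. The sketch below, with the hypothetical function name `exam_grade`, encodes only the point intervals. It assumes the preliminary limits are kept as listed and does not capture the additional criteria for Fx described under Examination.

```python
def exam_grade(points: int) -> str:
    """Map a written-exam score (0-36) to a letter grade using the listed limits."""
    if not 0 <= points <= 36:
        raise ValueError("score must be between 0 and 36")

    # Lower limit of each grade interval, checked from the highest grade down.
    limits = [(34, "A"), (30, "B"), (26, "C"), (22, "D"), (18, "E"), (15, "Fx")]
    for lower, grade in limits:
        if points >= lower:
            return grade

    # Anything below 15 points is a fail.
    return "F"
```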

 

 
