SF2526 VT20 (60271) Numerical algorithms for data-intensive science

Welcome to the course

SF2526 - Numerics for data science

Numerical algorithms for data-intensive science

 

Data analysis of large data sets is increasing in importance as a new tool in many fields in science and technology. This course gives an introduction to the use of many efficient numerical algorithms arising in problems in the analysis of large amounts of data. We use mathematical and numerical tools to study problems and algorithms.

 

Spectacle.nnS370.png

 

Want to get started: Try Quiz 0: Background and a bit and check out hw1.pdf under files.

 

 

Course contents and literature:

Before the course starts we recommend you review linear algebra summarized in Download background.pdf

and also do Quiz 0: Background and a bit.

The course consists of three blocks  (the PDF-files are preliminary and will be updated)

What do I need to do?

The examination of the course consists of correctly completing

The course has mandatory homeworks, and each homework can give you bonus points to the exam. If you in addition complete the course training tasks you can get additional bonus. See further description on the page for Homework / bonus points rules

Course training area

Throughout the course you have the possibility to train the material and get teacher feedback. This is done on the course wiki.

The work in the course training area is mainly intended for your own learning, and it can also lead to additional bonus points as described on Homework / bonus points rules

Course plan weekly

Lecture slides are available here (only accessible for registered students).

 

Week
Contents
1 Lecture 1: B1: Course introduction. Low rank approximation applications (image processing, machine learning). Matrix factorizations. QR-factorization.
2
Lecture 2: B1: Methods QR-factorization via Gram-Schmidt (greedy Gram-Schmidt by column elimination). Low-rank approximation via partial QR. CPQR. Best low-rank property (SVD). Eckart-Young theorem.
Lecture 3: B1: SVD from partial QR. Randomized QR.
 3 Lecture 4: B1: Interpolatory decomposition (ID), The CUR decomposition
Lecture 5: B2 Intro to clustering and K-means. Image processing applications. Plastic collection clustering application.  K-means. Similarity graphs.
 4 Lecture 6: B2: Intro to graphs and data on graphs. Connectedness. Subgraph connectedness. Unnormalized laplacian. Laplace eigenvectors and indicator vectors. Mincut.
Lecture 7: B2: Unnormalized Laplacian spectral clustering. K-means in spectral clustering. RatioCUT interpretation of clustering + Rayleigh Ritz theorem.
 5 Lecture 8: B2: RatioCut (k>2). Perturbation viewpoint. Normalized Laplacians. NCut.
Lecture 9: B3: Introduction to applications in signal processing. FFT, Cooley-Tukey, 
 6 Lecture 10: B3: Circulant matrices, diagonalization of circulant matrices, Toeplitz via extension to Circulant and FFT,
Lecture 11: B3: Durbin + Levinson-Durbing cont
 7 Lecture 12 B3: Trench + intro to Semiseperable matrices
  Lecture 13 B3: semiseparable + HODLR matrix algorithm / reserve lecture
Lecture 14:  summary

 

Related courses elsewhere: