Welcome to the class website of Learning from Data!

News
  • 2024-6-11: The poster PDF file is due on June 19, and the poster session will be held on June 21.

  • 2024-5-31: The Final Project Checkpoint runs from June 11 to June 14. Please make an appointment with the teaching staff to discuss your progress.

  • 2024-4-8: The midterm of this semester will be on 19th April, and a review session will be held by TAs later. Please kindly arrange your time and make preparations accordingly.

  • 2024-4-7: The deadline for PA2 is extended to April 10. WA2 will be due on April 13.

  • 2024-2-25: The first class will be held this week on Friday, March 1st, from 9:50am to 12pm.

Class info

  • Time: Friday 9:50am-12:15pm

  • Location: International Phase 1 (深圳国际一期) A307

This introductory course gives an overview of many concepts, techniques, and algorithms in machine learning, from linear models such as logistic regression and SVM to more advanced topics such as deep neural networks and reinforcement learning. The course will give the student the basic ideas and intuition behind modern machine learning methods as well as a formal understanding of how, why, and when they work. The underlying theme in the course is statistical inference as it provides the foundation for most of the methods covered.

For more information about grading, homework, and exam policies, see the class syllabus.

Prerequisites: Undergraduate level calculus, probability and linear algebra. Basic Python programming.

Team

Yang Li
Instructor

Yanru Wu
Head TA

Boshi Tang
TA

Jiahao Lai
TA

Office hours

Name     Time                     Location
Yang     Friday 2:00pm-4:00pm     Info Building 1108a
Yanru    Tuesday 4:00pm-6:00pm    Info Building, 11th floor common area
Boshi    Wednesday 2:00pm-4:00pm  Info Building 1701
Jiahao   Thursday 4:00pm-6:00pm   Info Building, 11th floor common area

You can also make appointments outside office hours.

Recitation & Review Sessions

Recitations will be held every Friday 9:00-9:45am in the lecture room.

Date Topic Reference
N/A 2022 Review Session: Basic linear algebra and probability; Scientific programming in Python. (programming demo) | (math review) | (math review notes) | (recording) The Matrix Cookbook by KB Petersen

3/8 WA0 Discussion (WA0 solution) (Github Classroom Tutorial)

3/15 Probability review

3/22 GLM review and geometric interpretation of linear regression (recording)

3/29 Weighted linear regression and KKT conditions (recording)

4/7 WA1 homework discussion (recording)

4/14 Midterm review (recording)

review session notes

5/5 Programming Skills Discussion (recording)

skill.pdf notes

5/24 SVD Discussion (recording)

Class Schedule

The main reference is the CS229 Machine Learning Lecture Notes (MLLN) by Andrew Ng and Tengyu Ma.

Date Topic Readings & References Homework Release
3/1

Written Assignment 0 (no submission required)

3/8

"Supervised learning" 1.1-1.3, 2.1-2.3 (MLLN)
Convex functions by Boyd & Vandenberghe (see Chapter 3)

Programming Assignment 1 due Mar 22

3/15

"Generalized linear models" (MLLN)
Reference: Generalized Linear Models by Nelder & Wedderburn (1972)

Written Assignment 1 (due Mar 29)

3/22

"Generative learning algorithms" (MLLN)
Reference: Event models for NB text classification by McCallum & Nigam (1998)

Programming Assignment 2 due April 9th

3/29

"Support Vector Machines" (MLLN)
References: Support-Vector Networks by Cortes and Vapnik; SVM notes from “Selected Applications of Convex Optimization” by Li Li; Convex Optimization by Stephen Boyd (see Chapter 5)

4/7

“Deep Feedforward Networks” from Deep Learning by Ian Goodfellow

WA1 solution

4/12

“Regularization and model selection” (MLLN); “Regularization for Deep Learning” from Deep Learning; Notes on matrix derivatives by Learned-Miller

Written Assignment 2

4/19 Midterm Exam

WA2 solution

4/26

"Generalization" (MLLN)

Final project information; recommended LaTeX template

5/10 Unsupervised Learning I: (slides with notes) (recording)

  • K-means clustering

  • spectral clustering

“Clustering and the k-means algorithm” (MLLN)

A tutorial on spectral clustering by Ulrike von Luxburg

5/17 Unsupervised Learning II: (slides with notes) (recording)

  • spectral clustering (continued)

  • PCA

"Principal component analysis" (MLLN)

5/24 Unsupervised Learning III:

“Independent Component Analysis” (MLLN); “Canonical Correlation Analysis” by Härdle & Simar

Written Assignment 4 (due June 7)

5/31

"Reinforcement Learning" (MLLN)
Playing Atari with Deep Reinforcement Learning by Mnih et al. Extended reference text: Reinforcement Learning: An Introduction, 2nd ed., by Sutton and Barto

6/7

  • LLM Alignment

6/14 Transfer learning (recording) (handout 1) (handout 2)

6/21 Final poster presentation