Learning from Data (Fall 2024)

Welcome to the class website of Learning from Data!

News

2024-9-10: The review session will be held on Thursday (9/12) evening, 7:00-9:00 PM, at International Phase 1, Room C501.
2024-10-28: Our midterm exam will take place on November 1st during class time. To ensure proper spacing, we have arranged two exam rooms: International Phase C405 and C606. Please check the Excel sheet to find your assigned exam room. If you notice that you have not been assigned a room, please contact me as soon as possible. Wishing everyone good luck on the exam!
2024-11-19: From November 25th to November 28th, you will have the opportunity to meet with the TA to discuss your final project proposal. After meeting with the TA, you should wait at least 3 days before scheduling a follow-up meeting with the teacher for further discussion. The discussion will account for 5% of your final project grade. Online Schedule
2024-11-23: Please complete the questionnaire about your participation during the last half of the semester. It is mandatory for everyone and is due by 23:59 on November 25th. As this is a self-assessment, we kindly ask that you answer the questions honestly. Questionnaire

Class info

Time: Friday 9:50am-12:15pm
Location: International Phase 1 (国际一期) C405A

This introductory course gives an overview of many concepts, techniques, and algorithms in machine learning, from linear models such as logistic regression and SVM to more advanced topics such as deep neural networks and reinforcement learning. The course will give the student the basic ideas and intuition behind modern machine learning methods as well as a formal understanding of how, why, and when they work. The underlying theme in the course is statistical inference as it provides the foundation for most of the methods covered.

For more information about grading, homework, and exam policies, see the class syllabus.

Prerequisites: Undergraduate-level calculus, probability, and linear algebra. Basic Python programming.

✨Q&A Document✨: This semester we encourage students to ask questions and discuss them together in the online document. Come and join the community 😊

Team

Yang Li
Instructor

Weida Wang
Head TA

Yuanbo Tang
TA

Tong Wu
TA

Chengfeng Wu
TA

Office hours

Name	Time	Location
Yang	Monday 2:00-4:00pm	Info Building 1108a
Weida	Thursday 5:00-6:00pm	Info Building, 11th floor common area
Yuanbo	Wednesday 5:00-6:00pm	Info Building, 11th floor common area
Tong	Friday 5:00-6:00pm	Info Building, 11th floor common area

You can also make appointments outside office hours.

Recitation & Review Sessions

Recitations will be held every Friday in the lecture room.

Date	Topic	Reference
9/12	Review Session: Basic linear algebra and probability; scientific programming in Python (Probability Theory, Coding Prerequisites, Python Tutorial | Course Video)	The Matrix Cookbook by KB Petersen
9/27	Recitation: Two examples of GLMs (GLM | Course Video)
10/11	Recitation: PA1 Q&A, KKT Conditions, Matrix Derivatives (Course Materials & Video)
10/18	Recitation: Backpropagation & Forward-Forward Algorithm (Course Material1 | Course Material2 | Video)	Forward-Forward Algorithm by Geoffrey Hinton
10/26	Midterm Review (Course Materials & Video)
11/15	Recitation: Learning Bounds (Course Materials & Video)
11/22	Recitation: Discussion on Programming Homework (Course Materials & Video)

Class Schedule

The main reference reading material is the CS229 Machine Learning Lecture Notes (MLLN) by Andrew Ng and Tengyu Ma.

Date	Topic	Readings & References	Homework Release
9/13	Introduction (Slides)		Written Assignment 0 (no submission required)
9/20	Supervised Learning I: Linear regression, logistic regression, multi-class classification (Slides)	"Supervised learning" 1.1-1.3, 2.1-2.3 (MLLN); Convex functions by Boyd & Vandenberghe (see Chapter 3)	Programming Assignment 1
9/27	Supervised Learning II: Generalized linear models, Gaussian discriminant analysis (GDA), naive Bayes (Slides1 | Slides2)	"Generalized linear models" (MLLN); Generalized Linear Models by Nelder & Wedderburn (1972); "Generative learning algorithms" (MLLN); Event models for NB text classification by McCallum & Nigam (1998)	Written Assignment 1
10/11	Supervised Learning III: Support Vector Machines (SVM) (Slides)	"Support Vector Machines" (MLLN); References: Support Vector Networks by Cortes and Vapnik; SVM notes from “Selected Applications of Convex Optimization” by Li Li; Convex optimization by Stephen Boyd (see Chapter 5)	Programming Assignment 2
10/18	Supervised Learning IV: Neural Networks and Backpropagation (Slides1 | Slides2)	"Support Vector Machines" (MLLN); “Deep Feed Forward Networks” from Deep Learning by Ian Goodfellow	Written Assignment 2, WA1 solution
10/25	Model Selection and Regularization (Slides)	“Regularization and model selection” (MLLN) “Regularization for Deep Learning”
11/08	Basic Learning Theory (Slides)	Reading for Generalization Bound: Rademacher Complexity; Paper on Rademacher generalization bound (with application to SVM, decision trees, and neural networks)	Programming Assignment 3
11/15	Unsupervised Learning I (Slides)	“Clustering and the k-means algorithm” (MLLN) A tutorial on spectral clustering by Ulrike von Luxburg	Written Assignment 3
11/22	Unsupervised Learning II: PCA (Slides)	"Principal component analysis" (MLLN)
11/29	Reinforcement Learning (Slides)	Model-based and model-free RL; Helicopter flight via reinforcement learning	Programming Assignment 4
12/06	Unsupervised Learning III: ICA, CCA (Slides)	“Independent Component Analysis” (MLLN); Deep CCA	Written Assignment 4
12/13	Deep Learning Architectures for Computer Vision and Natural Language Processing (Slides)
12/20	Deep Representation Learning and Foundation Models (Slides)