The course is taught as a sequence of units. Each unit takes one to two weeks, so the entire class fits into a single semester. Most units currently have four or five components:
- Lecture Notes: These are slides accompanying the class lecture. They include code snippets from the demos.
- Lecture Videos: The lecture videos are available on Brightspace.
- Whiteboard: MS OneNote
- Demo: These are Python-based Jupyter notebooks used for demonstrations during the lectures. Some demos have a component done in class. The demos do not generally cover every topic, since some concepts are left for the students to work out for themselves in the labs.
- Lab: Following the lecture, students complete a Python-based exercise at home that builds on the demo. The labs in the repository are given as skeletons with TODO markers that the students fill in.
- Problems: These are more analytic problems, also done at home.
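As a purely hypothetical illustration (this function is invented, not taken from an actual lab), the skeleton style might look like the following, shown here with one possible solution filled in so the snippet runs:

```python
# Hypothetical lab-skeleton style: students would see the TODO comments
# and write the code beneath them. A possible solution is filled in
# here so the example runs end to end.
import math

def standardize(x):
    """Scale a list of numbers to zero mean and unit variance."""
    mean = sum(x) / len(x)
    # TODO: compute the (population) variance of x
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    # TODO: return the standardized values
    return [(xi - mean) / math.sqrt(var) for xi in x]

z = standardize([1.0, 2.0, 3.0])   # zero mean, unit variance
```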
The problem and lab solutions are provided to students enrolled in the class. If you are an instructor and would like copies of the solutions, please contact Sundeep Rangan at srangan@nyu.edu.
- Syllabus: Spring 2022
- Setting up python and jupyter notebook
- Setting up github

Introduction
- Course Admin [pdf] [Powerpoint]
Unit 1: What is machine learning?
- Lecture: Introduction to Machine Learning [pdf] [Powerpoint]
- Demo: Github
- Demo: Setting up the environment on a local machine
- Demo: Google Cloud Platform
- Demo: Python Tutorial
- Demo: Introduction to numpy vectors
- Demo: Overview of Google Colab
Unit 2: Simple linear regression
- Lecture: Simple linear regression [pdf] [Powerpoint]
- Demo: Understanding automobile mpg
- Whiteboard: MS OneNote
- Homework (Due on Feb 8, 23:59 ET)
- Problem: [pdf]
- Lab: Boston housing data (submit both .ipynb and .pdf files)
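As a minimal sketch of what this unit's model computes (toy points, not the Boston housing data), simple linear regression has a closed-form least-squares solution:

```python
# Simple (one-feature) least-squares regression in closed form.
# Toy data for illustration only.

def fit_line(x, y):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```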
Unit 3: Multiple linear regression
- Lecture: Multiple linear regression [pdf] [Powerpoint]
- Demo 1: Predicting glucose levels
- Demo 2: Python broadcasting (Optional)
- Whiteboard: MS OneNote
- In-class Exercise: Linear regression
- Homework (Due on Feb 15, 23:59 ET)
- Lab: Calibrating robot dynamics
- Problems [pdf]
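The idea behind the optional broadcasting demo can be sketched as follows (a small illustration with invented arrays, assuming numpy is installed):

```python
# Broadcasting: arrays with compatible shapes combine without explicit loops.
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # shape (3, 2): 3 samples, 2 features

mu = X.mean(axis=0)          # shape (2,): per-feature means
Xc = X - mu                  # (3, 2) minus (2,) broadcasts across rows

scale = np.array([10.0, 100.0])
Xs = Xc * scale              # each column scaled by its own factor
```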
Unit 4: Model selection
- Lecture: Model selection [pdf] [Powerpoint]
- Demo 1: Polynomial order selection with cross-validation
- Demo 2: Feature transforms and model validation (Optional)
- Homework (Due on Feb 22, 23:59 ET)
- Lab: Neural decoding motor cortex signals
- Problems [pdf]
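Cross-validation, the workhorse of this unit, can be sketched as an index-splitting helper (a simplified stand-in for library routines such as scikit-learn's KFold):

```python
# K-fold split: every sample lands in the validation fold exactly once.

def kfold_indices(n, k):
    """Yield (train, val) index lists for k folds over n samples."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

folds = list(kfold_indices(10, 3))   # fold sizes 4, 3, 3
```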
Unit 5: Regularization and LASSO
- Lecture: LASSO Regularization [pdf] [Powerpoint]
- Demo 1: Predicting prostate cancer
- Demo 2: Predicting housing prices
- Homework (Due on Mar 1, 23:59 ET)
- Lab: EEG source localization
- Problems [pdf] [Latex]
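As a small aside on the mechanics: the soft-thresholding operator sits at the core of most LASSO solvers, shrinking each coefficient toward zero by the regularization weight. A minimal sketch:

```python
# Soft thresholding: the proximal operator of the L1 penalty,
# soft_threshold(x, t) = sign(x) * max(|x| - t, 0).

def soft_threshold(x, t):
    """Shrink x toward zero by t; zero it out if |x| <= t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

# Coefficients smaller than t in magnitude are zeroed out entirely,
# which is why LASSO produces sparse solutions.
```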
Unit 6: Linear classification and logistic regression
- Lecture: Linear classification and logistic regression [pdf] [Powerpoint]
- Demo: Breast cancer diagnosis via logistic regression
- Homework (Due on Mar 8, 23:59 ET)
- Lab: Genetic analysis of Down's syndrome in mice
- Problems: [pdf]
Unit 7: Nonlinear optimization
- Lecture: Nonlinear optimization and gradient descent [pdf] [Powerpoint]
- Demo 1: Computing gradients
- Demo 2: Simple gradient descent optimization
- Homework (Due on Mar 21, 23:59 ET, the Monday after spring break! Solutions will be released on Mar 22 at 0:01 AM so you can prepare for the exam.)
- Lab: Nonlinear least squares material modeling
- Problems: [pdf]
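The unit's central algorithm in its simplest form, on an invented one-dimensional objective f(x) = (x - 3)^2 with gradient f'(x) = 2(x - 3):

```python
# Gradient descent: repeatedly step against the gradient.

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Take `steps` gradient steps of size `lr` starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimizing f(x) = (x - 3)**2 converges to the minimizer x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```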
Midterm Exam
- 🔔🔔🔔 Midterm exam on Mar 22 @6PM.
- Closed-book exam; no electronic aids allowed.
- Two pages of cheat sheets allowed; you may write on both sides.
- Unless you have been approved by NYU, you must take the exam in person.
Unit 8: Support vector machines
- Lecture: SVM [pdf] [Powerpoint]
- Demo 1: MNIST digit classification
- Demo 2: Visualizing kernels
- Homework (Due on Apr 5, 23:59 ET)
- Lab: Extended MNIST with letters
- Problems: [pdf]
Unit 9: Neural networks with Keras and Tensorflow
- Lecture: Neural networks [pdf] [Powerpoint]
- Supplementary notes with solved problems [pdf] [Latex]
- Demo 1: First neural network in Keras
- Demo 2: MNIST neural network classification
- In-class Exercise
- Homework (Due on Apr 12, 23:59 ET)
- Lab: Music instrument classification
- Problems: [pdf]
Unit 10: Convolutional and deep networks
- Lecture: Convolutional and deep networks [pdf] [Powerpoint]
- Setting up a GPU instance (Recommended)
- Demo 1: 2D convolutions and convolutional layers in Keras
- Demo 2: Creating an image set using the Flickr API
- Demo 3: Exploring the deep VGG16 network
- Demo 4: Building an image classifier using CIFAR10 dataset
- Demo 5: Building an autoencoder for image denoising using CIFAR10 dataset
- Homework (Due on Apr 19, 23:59 ET)
- Lab: Transfer learning with a pre-trained network (GPU recommended)
- Problems: [pdf]
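The operation behind Demo 1 can be sketched in plain Python: a "valid" 2-D convolution (strictly, cross-correlation, the convention most deep-learning libraries use), on an invented tiny image:

```python
# "Valid" 2-D cross-correlation: slide the kernel over the image
# with no padding and sum the elementwise products at each position.

def conv2d_valid(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A vertical difference kernel responds only at the horizontal edge.
img = [[0, 0, 0, 0],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]
edges = conv2d_valid(img, [[1], [-1]])
```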
Unit 11: PCA
- Lecture: PCA [pdf] [Powerpoint]
- Demo 1: PCA eigen-faces-SVM
- Demo 2: Low-rank matrix completion via embedding layers
- Homework (Due on Apr 26, 23:59 ET)
- Lab: PCA with hyper-parameter optimization
- Problems: [pdf]
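As a sketch of what PCA extracts, here is the top principal component of toy 2-D data via power iteration on the covariance matrix (pure Python; the data points are invented):

```python
# Top principal component of 2-D data by power iteration on the
# 2x2 covariance matrix of the centered points.

def top_component(data, iters=200):
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    pts = [(x - mx, y - my) for x, y in data]
    # Covariance matrix entries
    cxx = sum(x * x for x, _ in pts) / n
    cxy = sum(x * y for x, y in pts) / n
    cyy = sum(y * y for _, y in pts) / n
    # Power iteration: repeated multiply-and-normalize converges
    # to the dominant eigenvector.
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

# Points spread along y ~ x give a first component close to (1, 1)/sqrt(2).
v = top_component([(-2.0, -2.1), (-1.0, -0.9), (1.0, 1.1), (2.0, 1.9)])
```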
Unit 12: Clustering and EM
- Lecture: Clustering and EM [pdf] [Powerpoint]
- Demo 1: Document clustering via k-means and latent semantic analysis
- Demo 2: Color quantization via k-means and EM-GMM
- Homework (Due on May 3, 23:59 ET)
- Problems: [pdf]
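The clustering idea behind both demos, Lloyd's k-means algorithm, fits in a few lines for one-dimensional data (toy numbers, invented for illustration):

```python
# Lloyd's k-means in one dimension: alternate assignment and update steps.

def kmeans_1d(points, centers, iters=20):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (keep the old center if its cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# Two well-separated groups around 1 and 10.
centers = kmeans_1d([0.9, 1.0, 1.1, 9.9, 10.0, 10.1], [0.0, 5.0])
```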
Unit 13: Decision Trees and Random Forest
- Lecture: Decision tree and random forest [pdf] [Powerpoint]
- Demo: Prediction of temperature using decision tree and random forest
- Homework [pdf] (will NOT be graded)
Final Exam
- 🔔🔔🔔 Final Exam will be on May 10 @6PM.
Course Project
- The course project is OPTIONAL. It counts for 20% of your total grade. If you choose not to do it, I will take the higher of your midterm and final exam grades and use it for your project grade.
- Doing a project will never hurt your grade (assuming no plagiarism in the project). If the project grade is lower than your exam grades, we will use your exams for that 20% portion and not consider the project.
- Make sure you do not plagiarize. Clearly cite every source you use, whether it is a blog post, a GitHub repository, or a published paper. It is very important that you do not omit any source you used. This class and NYU take plagiarism very seriously; if plagiarism is detected, you will get a ZERO on the project. (You may use a blog post or GitHub repository as starter code: cite it, clearly describe how your work differs from the starter code, and explain your contribution.)
- Project groups consist of one or two students.
- List of suggested project topics can be found here.
- A recently published paper plus existing code can serve as a starting point for your project.
- You are free to choose a topic not listed here.
- What do we expect?
- The project should be different from something you can directly find in a blog post or on Kaggle. Always add references for any data, code, or sentences you copied from online sources.
- You can try to improve the accuracy reported in a blog post;
- You can train on additional data and test on real-world data that you collected yourself;
- You can change the model, try hyperparameter optimization, etc.;
- You can compare different models and report their performance, including which kinds of data points each model tends to miss;
- If you cannot make improvements, tell us why the approach failed and what lessons you learned.
- 🔔🔔🔔 Project report is due on May 17, 23:59 ET.
- Submit your code and report on Gradescope;
- Follow the submission guidelines listed here
- Make sure we can reproduce your results. If you used data that is not publicly available online, please upload it to Google Drive/Dropbox/GitHub so that we can access it;
- You need a quad chart at the beginning of your report, which serves as an executive summary. Very briefly describe the motivation, why the problem should be solved with ML and the tool you picked, your contribution (how it differs from existing work), your results, and a summary with future directions and lessons learned. Be concise!
- The report should be formatted as a paper a few pages long.
------Materials below are outdated and will be updated before each class.------