ELEN 6887 - Statistical Pattern Recognition Theory
This course will introduce you to the basic theory of statistical learning. A tentative list of the course contents for the first part of the course is available.
Instructor:
Rui Castro
Web: http://www.ee.columbia.edu/~rmcastro
Phone: (+1) 212 854 0513
Office: 716 CEPSR
Office hours: Tue 4pm-6pm, or email for an appointment
Lectures and Location
Monday and Wednesday 2:40pm - 3:55pm in 420 Pupin
lecture 1 - 1/21/2009 - A Probabilistic Approach to Pattern Recognition
lecture 2 - 1/26/2009 - Introduction to Classification and Regression
lecture 3 - 1/28/2009 - Introduction to Complexity Regularization
lecture 4 - 2/2/2009 - Estimation of Smooth Functions
lecture 5 - 2/4/2009 - Plug-in Rules and the Histogram Classifier
lecture 6 - 2/9/2009 - Introduction to PAC Learning
lecture 7 - 2/11/2009 - PAC Bounds and Concentration of Measure
- Additional material: Concentration by Colin McDiarmid (in Probabilistic Methods for Algorithmic Discrete Mathematics, 1998, 1-46)
lecture 8 - 2/15/2009 - Bounded Losses: Error Bounds
lecture 9 - 2/17/2009 - Countable Classes of Models
lecture 10 - 2/23/2009 - Complexity Regularization Bounds
lecture 11 - 3/4/2009 - Decision Trees and Classification
lecture 12 - 3/23/2009 - Complexity Regularization and the Squared Loss
lecture 13 - 3/30/2009 - Maximum Likelihood Estimation
lecture 14 - 3/31/2009 - Complexity Regularization and Maximum Likelihood Estimation
lecture 15 - 4/6-8/2009 - Denoising of smooth functions with unknown smoothness
lecture 16 - 4/13/2009 - Approximation using Wavelets
lecture 17 - 4/15-20/2009 - Denoising and Spatial Adaptivity - The Magic of Wavelets
lecture 18 - 4/22/2009 - Introduction to Vapnik-Chervonenkis Theory
lecture 19 - 4/27/2009 - The proof of the VC inequality
lecture 20 - 4/29/2009 - Applying the results of VC theory
Homework
homework 1 - Due February 16th (MATLAB file needed: magnolia.mat)
homework 2 - Due February 23rd - solution
homework 3 - Due March 2nd
homework 4 - Due March 23rd - solution (© 2009 Romain Leplomb)
homework 5 - Due April 8th
Papers
The Boosting Approach to Machine Learning - An Overview by Robert Schapire
Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods by Robert Schapire et al
An Introduction to Kernel-Based Learning Algorithms by Klaus-Robert Müller et al
Stability and Generalization by Olivier Bousquet and Andre Elisseeff
References
There is no formal textbook for this course. Throughout the semester several notes and relevant papers will be distributed. Below you will find a list of further reading materials.
- A probabilistic theory of pattern recognition, Devroye, Gyorfi, Lugosi, Springer
- The Elements of Statistical Learning, Hastie, et al, Springer
- Combinatorial methods in density estimation, Devroye and Lugosi, Springer
- Statistical Learning Theory, Vapnik, Wiley
- An Introduction to Computational Learning Theory, Kearns and Vazirani, MIT Press
Although you will not need to know measure-theoretic probability in depth for the course, it does help. Below are two references on the topic.
- Probability and Measure, Billingsley, Wiley
- A Probability Path, Resnick, Birkhäuser
Prerequisites
Competence in applied mathematics and probability. Knowledge of statistics.
Format and Evaluation
The course will consist of several introductory lectures followed by readings and discussions of recent developments in the area. There will be several homework assignments throughout the semester. Your grade will be based on course participation, homework, and paper presentations.