LEC # | TOPICS | READINGS |
---|---|---|
1 | Introduction, linear classification, perceptron update rule | |
2 | Perceptron convergence, generalization | |
3 | Maximum margin classification | Optional: Cristianini, Nello, and John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge, UK: Cambridge University Press, 2000. ISBN: 9780521780193. Burges, Christopher. “A Tutorial on Support Vector Machines for Pattern Recognition.” Data Mining and Knowledge Discovery 2, no. 2 (June 1998): 121-167. |
4 | Classification errors, regularization, logistic regression | |
5 | Linear regression, estimator bias and variance, active learning | |
6 | Active learning (cont.), non-linear predictions, kernels | |
7 | Kernel regression, kernels | |
8 | Support vector machine (SVM) and kernels, kernel optimization | Short tutorial on Lagrange multipliers (PDF). Optional: Stephen Boyd’s course notes on convex optimization. Boyd, Stephen, and Lieven Vandenberghe. Convex Optimization. Cambridge, UK: Cambridge University Press, 2004. ISBN: 9780521833783. |
9 | Model selection | |
10 | Model selection criteria | |
Midterm | ||
11 | Description length, feature selection | |
12 | Combining classifiers, boosting | |
13 | Boosting, margin, and complexity | Optional: Schapire, Robert. “A Brief Introduction to Boosting.” Proceedings of the 16th International Joint Conference on Artificial Intelligence, 1999, pp. 1401-1406. |
14 | Margin and generalization, mixture models | Optional: Bartlett, Peter, Yoav Freund, Wee Sun Lee, and Robert E. Schapire. “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods.” Annals of Statistics 26, no. 5 (1998): 1651-1686. |
15 | Mixtures and the expectation maximization (EM) algorithm | |
16 | EM, regularization, clustering | |
17 | Clustering | |
18 | Spectral clustering, Markov models | Optional: Shi, Jianbo, and Jitendra Malik. “Normalized Cuts and Image Segmentation.” IEEE Transactions on Pattern Analysis and Machine Intelligence 22, no. 8 (2000): 888-905. |
19 | Hidden Markov models (HMMs) | Optional: Rabiner, Lawrence R. “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.” Proceedings of the IEEE 77, no. 2 (1989): 257-286. |
20 | HMMs (cont.) | |
21 | Bayesian networks | Optional: Heckerman, David. “A Tutorial on Learning with Bayesian Networks.” In Learning in Graphical Models. Edited by Michael I. Jordan. Cambridge, MA: MIT Press, 1998. ISBN: 9780262600323. |
22 | Learning Bayesian networks | |
23 | Probabilistic inference; guest lecture on collaborative filtering | |
Final | ||
24 | Current problems in machine learning, wrap up | |
References
Bishop, Christopher. Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 9780198538646.
Duda, Richard, Peter Hart, and David Stork. Pattern Classification. 2nd ed. New York, NY: Wiley-Interscience, 2000. ISBN: 9780471056690.
Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York, NY: Springer, 2001. ISBN: 9780387952840.
MacKay, David. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press, 2003. ISBN: 9780521642989. Available online.
Mitchell, Tom. Machine Learning. New York, NY: McGraw-Hill, 1997. ISBN: 9780070428072.
Cover, Thomas M., and Joy A. Thomas. Elements of Information Theory. New York, NY: Wiley-Interscience, 1991. ISBN: 9780471062592.