Andrea Passerini

Machine Learning

General information

Degree: Master in Computer Science
Faculty: Scienze MM.FF.NN.
Period: September 2010 - December 2010

Objectives

Provide knowledge of both the theoretical and practical aspects of machine learning, covering the main techniques of supervised and unsupervised learning as well as probabilistic reasoning. Show applications of machine learning techniques to real-world problems.

Prerequisites

Linear algebra and probability theory (briefly revised during the course); Boolean algebra; knowledge of a programming language.

Content

Introduction to machine learning: designing a machine learning system, learning settings and tasks, decision trees, k-nearest-neighbour estimation.
Mathematical foundations: linear algebra, probability theory, statistical tests.
Bayesian decision theory: maximum likelihood and Bayesian parameter estimation.
Probabilistic graphical models: inference, parameter and structure learning.
Neural networks: perceptron, multilayer neural networks.
Clustering: k-means, hierarchical clustering.
Kernel machines: kernels, reproducing kernel Hilbert spaces, the representer theorem, support vector machines for classification, regression and ranking, kernel construction, kernels for structured data.
Statistical learning theory: PAC learning, consistency, VC dimension, generalization and model comparison.
Applications to text categorization and bioinformatics.
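To give a flavour of the techniques covered, k-nearest-neighbour estimation (one of the introductory topics above) can be sketched in a few lines of Python. This is a minimal illustration, not course material; the function and dataset are made up for the example:

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points, using Euclidean distance.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training examples by distance to the query point
    # and keep the k closest ones.
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy dataset: two well-separated classes in 2D.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
print(knn_predict(train, (0.15, 0.1)))   # -> a
print(knn_predict(train, (1.05, 0.95)))  # -> b
```

Note that k-NN has no training phase: all the work happens at prediction time, which is why it is often called a lazy learner.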

Course Information

Instructor: Andrea Passerini
Email:
Note: joint course with Alessandro Moschitti; see his homepage for his material.
Office hours: Wednesday 10:30-12:30
Lecture time and place: Tuesday 14:00-16:00
Wednesday 8:30-10:30
Bibliography: R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification (2nd edition), Wiley-Interscience, 2001.
C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
T. Mitchell, Machine Learning, McGraw Hill, 1997.
Material: Slides and handouts (PDF format)
Introduction [slides] [handouts]
Decision Trees [slides] [handouts]
K-nearest neighbours [slides] [handouts]
Linear algebra [slides] [handouts]
Probability theory [slides] [handouts]
Bayesian decision theory [slides] [handouts]
Parameter estimation [slides] [handouts]
Bayesian Networks [slides] [handouts]
Inference in BN [slides] [handouts]
Learning BN [slides] [handouts]
Naive Bayes [slides] [handouts]
Expectation Maximization [slides] [handouts]
Hypothesis Testing [slides] [handouts]
Clustering [slides] [handouts]

Exams

Modality: Project [description] and oral examination
Projects: (1) Hierarchical gene clustering [desc] [data] (to: Andrea Zito)
(2) Patient classification by gene expression [desc] [data] (to: Lucas Mariano)
(3) Bayesian Network of pathologies [desc] [data] (to: Danilo Tomasoni and Rino Napo)
(4) Protein subcellular localization [desc] [data] (to: Bhavani Vaidya)
(5) Disulphide bonding state prediction [desc] [data] (to: Mauro Fruet and Mattia Gastaldello)
(6) Comparing learning algorithms [desc] [data] (to: Fabrizia Toss and Antonio Quartulli)
... additional projects will follow ...