Introduction to Applied Statistical Learning - CI/MS
Teacher: PERCHET Vianney
Department: Statistics
ECTS: 1
Course Hours: 12
Tutorials Hours: 6
Language: French
Examination Modality: written exam
Objective
This course is a comprehensive introduction to machine learning methods. It introduces the typical problems of describing and modeling data in order to predict the response of a new individual. We describe the algorithms and quantify their performance; in parallel, R-based practical sessions show how to use these methods in practice.
At the end of this course, students should be able to:
- Set up classification or regression methods
- Understand the theory behind the methods presented
- Read and interpret the numerical outputs of these methods
Planning
Introduction.
- Difference between estimation (statistics) and prediction (machine learning); definitions of loss functions, risk, and empirical risk.
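As a minimal illustration of these definitions, the empirical risk of a predictor is the average of a loss function over the observed sample. The sketch below assumes a squared loss and a 0-1 loss; the names `squared_loss`, `zero_one_loss`, and `empirical_risk` are illustrative and do not come from the course material.

```python
def squared_loss(y_true, y_pred):
    """Loss commonly used in regression."""
    return (y_true - y_pred) ** 2


def zero_one_loss(y_true, y_pred):
    """Loss commonly used in classification: 1 for an error, 0 otherwise."""
    return 0.0 if y_true == y_pred else 1.0


def empirical_risk(predictor, xs, ys, loss):
    """Average loss of `predictor` over the sample (xs, ys)."""
    return sum(loss(y, predictor(x)) for x, y in zip(xs, ys)) / len(xs)


# Toy regression sample (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 5.0]
regression_risk = empirical_risk(lambda x: x, xs, ys, squared_loss)

# Toy classification sample: predict the sign of x.
cx = [-2.0, -1.0, 1.0, 2.0]
cy = [-1, -1, 1, -1]
classification_risk = empirical_risk(
    lambda x: 1 if x > 0 else -1, cx, cy, zero_one_loss
)
```

The (true) risk replaces this sample average by an expectation over the unknown data distribution, which is why it can only be estimated in practice.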
Classification algorithms.
- Methods from statistics: linear discrimination. The nearest-neighbor method and other universally consistent methods. Decision trees and random forests.
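To make the nearest-neighbor method concrete, here is a small sketch of a k-nearest-neighbor classifier with Euclidean distance and majority vote. The function name `knn_predict` and the toy data are assumptions made for illustration; the practical sessions of the course use R.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Sort training examples by Euclidean distance to the query.
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Count the labels of the k closest examples and return the most frequent.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (made up for illustration).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)
near_far_cluster = knn_predict(points, labels, (5.5, 5.5), k=3)
```

The choice of k trades off variance (small k) against bias (large k), and is typically tuned on held-out data.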
Regression algorithms.
- Least squares method. Penalization methods: the ridge, LASSO, and Elastic Net estimators.
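The effect of the penalties can be seen in the simplest setting: a single covariate and no intercept. The sketch below assumes the objective (1/2) Σ (y_i − b·x_i)² plus a penalty of (λ/2)·b² for ridge or λ·|b| for LASSO. Other scalings of the objective are common and shift the value of λ accordingly. All function names are illustrative.

```python
def least_squares_1d(xs, ys):
    """Minimizer of (1/2) * sum((y - b*x)^2) over b."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def ridge_1d(xs, ys, lam):
    """Minimizer of (1/2) * sum((y - b*x)^2) + (lam/2) * b^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_1d(xs, ys, lam):
    """Minimizer of (1/2) * sum((y - b*x)^2) + lam * |b|."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam) / sxx


# Toy data with exact slope 2 (made up for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

b_ols = least_squares_1d(xs, ys)
b_ridge = ridge_1d(xs, ys, lam=14.0)
b_lasso = lasso_1d(xs, ys, lam=14.0)
b_lasso_large = lasso_1d(xs, ys, lam=30.0)
```

Ridge shrinks the coefficient continuously but never sets it to zero, whereas a large enough LASSO penalty yields an exactly zero coefficient. The Elastic Net combines both penalties.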
Selection of estimators.
- Empirical risk minimization methods. Training and test data. Cross-validation.
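A minimal sketch of k-fold cross-validation follows. It assumes a user-supplied `fit` function that returns a predictor and a pointwise `loss`; both names, the deterministic fold assignment, and the toy data are illustrative choices. In practice the data would usually be shuffled before being split into folds.

```python
def k_fold_cv(xs, ys, fit, loss, n_folds=5):
    """Estimate the risk of the procedure `fit` by k-fold cross-validation."""
    n = len(xs)
    fold_risks = []
    for fold in range(n_folds):
        # Observation i belongs to fold (i mod n_folds); no shuffling here.
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        # Fit on the training part only, then evaluate on the held-out fold.
        predictor = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_risk = sum(loss(ys[i], predictor(xs[i])) for i in test_idx) / len(test_idx)
        fold_risks.append(fold_risk)
    # Average the held-out risks over all folds.
    return sum(fold_risks) / n_folds


def fit_mean(train_xs, train_ys):
    """Baseline procedure: always predict the mean of the training responses."""
    mean = sum(train_ys) / len(train_ys)
    return lambda x: mean


def squared_loss(y_true, y_pred):
    return (y_true - y_pred) ** 2


# Toy data (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cv_risk = k_fold_cv(xs, ys, fit_mean, squared_loss, n_folds=3)
```

Comparing `cv_risk` across candidate procedures (for example, different penalty levels or numbers of neighbors) is one way to select an estimator without touching the final test data.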
References
Devroye, Györfi, Lugosi - A Probabilistic Theory of Pattern Recognition - (1996) Springer-Verlag
Hastie, Tibshirani, Friedman - The Elements of Statistical Learning - (2008) Springer Series in Statistics