# Introduction to Statistics
- Lectures : Wednesday
- Exercises : Monday
# References
- V. Rivoirard and G. Stoltz. *Statistique en action*. Dunod.
- P. Bickel and K. Doksum. *Mathematical Statistics: Basic Ideas and Selected Topics*. Pearson.
- J. A. Rice. *Mathematical Statistics and Data Analysis*. Wadsworth.
- D. Freedman. *Statistical Models*. Cambridge University Press.
- A. van der Vaart. *Asymptotic Statistics*. Cambridge University Press.
- L. Devroye and G. Lugosi. *Combinatorial Methods in Density Estimation*. Springer.
# Map
1. Modeling and statistical inference: estimation, confidence regions, tests.
1. Gaussian Vectors. Conditioning. Concentration.
1. Gaussian Linear Models. Regression. Analysis of variance.
1. Estimation Methods. Moments. Contrast Minimization. Likelihood Methods.
1. Exponential Models. Maximum Likelihood Estimation. Information inequalities.
1. Tests. Fundamental Lemma. Uniformly most powerful tests.
1. Chi-square tests.
1. Non-parametric Tests. Empirical distribution function. Uniform deviation inequalities. Rank tests.
1. Bayesian methods.
1. Risk and efficiency. Minimaxity.
1. Non-parametric models: density estimation. Kernel methods (2 lectures).
# Syllabus
No inference without modeling: statistical inference assumes that data are generated by some random mechanism, the parameters of which are to be estimated. The random generation mechanism (the model) may be parametrized by a finite or an infinite-dimensional set. In this introductory course, most models are finite-dimensional.
In a few words, statistical inference is concerned with three main topics:
- Building point estimates of parameters;
- Building confidence sets for estimands;
- Deciding whether one model fits the data better than another (testing).
These three questions turn out to be intimately connected.
These questions will first be investigated in a Gaussian framework, which allows for transparent proofs. Furthermore, Gaussian models arise as limits of many statistical experiments.
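As a first taste of inference in the Gaussian framework, the following sketch builds a 95% confidence interval for the mean of a Gaussian sample with known variance (the sample values here are hypothetical, chosen for illustration):

```python
import math
import statistics

# 95% confidence interval for the mean of a Gaussian sample with
# KNOWN variance sigma^2: xbar +/- z_{0.975} * sigma / sqrt(n),
# where z_{0.975} ~ 1.96 is the standard normal quantile.
sigma = 1.0
sample = [0.2, -0.5, 1.1, 0.7, -0.1, 0.4, 0.9, -0.3]  # hypothetical data
n = len(sample)

xbar = statistics.mean(sample)
half_width = 1.96 * sigma / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(ci)
```

The same recipe, with the estimated standard deviation and a Student quantile in place of 1.96, gives the interval used when the variance is unknown.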
We will then describe a principled approach to the design of inference methods: minimum contrast estimators, moment estimators, maximum likelihood estimators. These techniques will be illustrated on the simplest regular models: the exponential models.
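For a regular model with a closed-form maximizer, maximum likelihood estimation reduces to a one-line formula. A minimal sketch for the exponential model (simulated data, not course material):

```python
import math
import random

random.seed(0)

# Simulate n i.i.d. draws from an Exponential(rate) model.
rate = 2.0
n = 10_000
sample = [random.expovariate(rate) for _ in range(n)]

# The log-likelihood is l(lam) = n*log(lam) - lam*sum(x_i);
# setting its derivative to zero gives the MLE in closed form:
lam_hat = n / sum(sample)  # i.e. 1 / sample mean
print(lam_hat)
```

In models without a closed-form maximizer, the same contrast (the negative log-likelihood) is minimized numerically.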
The design and analysis of testing procedures start from the Neyman-Pearson Lemma, which settles the question for simple binary hypotheses and serves as a stepping stone to more complicated settings.
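In the textbook Gaussian case the Neyman-Pearson test takes a one-line form. A sketch, assuming one observation and the standard 5% level:

```python
# Neyman-Pearson test of H0: N(0,1) against H1: N(1,1) from a single
# observation x. The likelihood ratio is increasing in x, so the
# level-alpha NP test rejects H0 when x exceeds the standard normal
# quantile z_{1-alpha} (z_{0.95} ~ 1.645 for alpha = 0.05).
def np_test(x: float, z_alpha: float = 1.645) -> bool:
    """Return True when H0 is rejected at level alpha."""
    return x > z_alpha

print(np_test(2.0), np_test(0.5))  # True False
```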
Chi-square tests are both practically important and theoretically interesting. Even though their use is ultimately justified by asymptotic arguments, they perform well on samples of moderate size and illustrate convergence rates towards limit distributions.
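The chi-square goodness-of-fit statistic is simple to compute by hand. A minimal sketch with hypothetical die-roll counts (the 5% critical value for 5 degrees of freedom is approximately 11.07):

```python
# Observed counts of 6000 die rolls over faces 1..6 (hypothetical data).
observed = [1020, 980, 1005, 995, 1010, 990]
n = sum(observed)
expected = n / 6  # expected count per face under the uniform null

# Pearson's chi-square statistic: sum of (O - E)^2 / E over the cells.
chi2 = sum((o - expected) ** 2 / expected for o in observed)

# 5% critical value of the chi-square distribution, 6 - 1 = 5 df.
CRITICAL_5PCT_DF5 = 11.07
print(chi2, chi2 < CRITICAL_5PCT_DF5)  # small statistic: do not reject
```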
Non-parametric tests such as the Kolmogorov-Smirnov test illustrate the use of empirical processes techniques that play a fundamental role in non-parametric statistics and in learning theory. These tests allow us to assess qualitative hypotheses that are beyond the scope of parametric techniques.
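The Kolmogorov-Smirnov statistic is the uniform deviation between the empirical distribution function and the hypothesized one. A sketch against the Uniform(0,1) null, on a small hypothetical sample:

```python
# Kolmogorov-Smirnov statistic D_n = sup_x |F_n(x) - F(x)| for a sample
# tested against F(x) = x, the Uniform(0,1) CDF. For a sorted sample the
# supremum is attained at a data point, giving the two one-sided terms:
sample = sorted([0.05, 0.21, 0.34, 0.48, 0.62, 0.71, 0.83, 0.96])
n = len(sample)

d_plus = max((i + 1) / n - x for i, x in enumerate(sample))
d_minus = max(x - i / n for i, x in enumerate(sample))
D = max(d_plus, d_minus)
print(D)
```

The test rejects when sqrt(n) * D exceeds a quantile of the Kolmogorov distribution, which is distribution-free under the null.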
Bayesian methods may be viewed from two perspectives:
- we may assume that nature picks a data generation mechanism at random according to a known scheme, and try to take advantage of the knowledge of this scheme to design estimation and testing methods.
- we may act *as if* nature were picking at random the data generating mechanism, and design estimators and tests accordingly. What changes then is the style of performance analysis.
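In conjugate models the Bayesian update is explicit. A minimal sketch with a Beta prior on a coin's heads probability (hypothetical data):

```python
# Beta(a, b) prior on a coin's heads probability theta. After observing
# k heads in n tosses, conjugacy gives the posterior Beta(a+k, b+n-k).
a, b = 1.0, 1.0   # Beta(1,1) = uniform prior
k, n = 7, 10      # hypothetical data: 7 heads in 10 tosses

post_a, post_b = a + k, b + n - k
# Posterior mean = Bayes estimator of theta under squared-error loss.
post_mean = post_a / (post_a + post_b)
print(post_a, post_b, post_mean)
```

Note how the estimator shrinks the empirical frequency 7/10 toward the prior mean 1/2.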
Decision theory provides a framework for the systematic analysis of estimation and testing methods. It tells us, among other things, that any minimax method is the limit of Bayesian methods.
The course ends with an introduction to non-parametric estimation (kernel methods in density estimation).
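As a preview of the last lectures, a kernel density estimator averages a scaled kernel centered at each observation. A minimal sketch with a Gaussian kernel and a hypothetical sample (bandwidth chosen arbitrarily):

```python
import math

def kde(sample: list[float], x: float, h: float) -> float:
    """Gaussian kernel density estimate at x with bandwidth h:
    (1 / (n*h)) * sum_i phi((x - x_i) / h), phi the standard normal pdf."""
    n = len(sample)
    return sum(
        math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample
    ) / (n * h * math.sqrt(2 * math.pi))

sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]  # hypothetical data
print(kde(sample, 0.0, h=0.5))
```

The estimate is a genuine density (nonnegative, integrating to one); the choice of the bandwidth h governs the bias-variance trade-off studied in the lectures.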