Teachings

Professor

Period
Second Semester
Teaching style
Conventional
Teaching language
ITALIAN

Additional information

Course: [60/65] MATHEMATICS
Curriculum: [65/60 - Ord. 2020] MATEMATICA APPLICATA
CFU: 9
Length (h): 72

Objectives

The course consists of two parts: 32 hours of data modeling techniques and 40 hours of introduction to stochastic processes. The first part of the course aims to present the concepts and the mathematical theory of statistical modeling that are essential for a rigorous and complete statistical investigation, and to give students the practical tools to apply the techniques learned, so that they can collaborate in research centers and/or companies where statistical data are processed.
The main objective of the second part of the course is to provide students with an introduction to the fundamental concepts of the theory of stochastic processes with particular attention to Markov chains and their applications.

1) KNOWLEDGE AND UNDERSTANDING
Ability to formalize and model problems of inferential statistics. Knowledge of the fundamental characteristics of some classes of stochastic processes relevant to applications, and ability to rigorously prove the main theoretical results related to their analysis.

2) APPLICATION ABILITIES
Knowing how to carry out, with the support of suitable software, a data analysis in all its phases: data acquisition, data description and statistical modeling of the data. Being able to identify random phenomena that can be modeled as stochastic processes and to analyze them using the theoretical knowledge acquired in the course.

3) JUDGMENT AUTONOMY
Understanding the principles behind the development of the models presented in the first part of the course, which will guide the student in interpreting the results of a statistical analysis and in replicating the analysis if necessary. Gaining an overall view of the theory of Markov chains and its applications, which will guide the student in tackling the proposed exercises critically and in independently exploring topics related to those presented in the second part of the course.

4) COMMUNICATION SKILLS
At the end of the first part of the course, the student should be able to communicate (orally and in writing) the results of a rigorous statistical analysis. The ability to process data and to read the results will enable the student to collaborate in applied research projects, including outside the mathematical field. At the end of the second part of the course, the student should be able to present the topics covered using appropriate technical language and correct mathematical formalism.

5) LEARNING SKILLS
The first part of the course will be supported by a large number of exercises using the statistical software R on all the topics covered, as an approach to solving problems in descriptive and advanced inferential statistics. The course cannot cover every possible method of data analysis, but it gives the student the tools to independently understand and experiment with alternative methods better suited to the practical problem of interest. Moreover, the student will acquire the tools to independently analyze the classes of stochastic processes treated in the second part of the course. These will provide the student with a solid theoretical basis for analyzing further classes of stochastic processes later in her/his university or working career, as well as for finding her/his way in the scientific literature of the field.

Prerequisites

The student must know the basics of probability theory and the descriptive and inferential methods of mathematical statistics. The student should also be able to handle the fundamental tools of linear algebra and matrix calculus.

Contents

The course will focus on the topics listed below.

For the first part:
1. Methods of linear regression: inference for simple linear regression, the Gauss-Markov theorem and hypothesis tests on the model parameters, methods for checking the model assumptions; multiple linear regression as an extension of simple linear regression, statistical inference in models with normal errors, analysis of residuals, goodness-of-fit measures and variable selection techniques; qualitative variables as covariates, the concept of dummy variable, analysis of covariance and analysis of variance as linear regression.
2. One-way ANOVA (analysis of variance) and introduction to the two-way framework: introduction to the problem, derivation of the test statistic and its distribution. Theory of contrasts and multiple comparisons.
3. Generalized linear models: definition of the model and its assumptions, estimation of the parameters, statistical inference. Definition and application to binomial responses (logistic regression).
4. Quantile regression: definition of the minimum absolute deviation problem, parameter estimation, statistical inference.
5. Chi-squared tests for goodness of fit, independence and homogeneity.
6. Main nonparametric tests for the location problem. Sign tests and rank tests. Wilcoxon, Mann-Whitney and Kruskal-Wallis tests.

For the second part:
1. Definition of stochastic process and classification based on time and states.
Fundamental quantities associated with a stochastic process and comparison between two processes.
Finite dimensional distributions and stationary processes.
2. Discrete-time Markov chains.
Definition and transition matrix.
Computation of the finite-dimensional distributions of a homogeneous Markov chain.
m-step transition matrices and Chapman-Kolmogorov equations.
Graph associated with a homogeneous Markov chain.
Canonical representations through recurrence equations.
Stopping times: first visit time and first return time.
Strong Markov property.
Relationship between number of visits and return times in a state.
State classification.
Communicating states and closed classes.
Irreducible and aperiodic chains.
Recurrent, transient and absorbing states.
Positive recurrence and null recurrence.
Recurrence and transience criteria.
Stationary regime: invariant distributions and global balance equations.
Time-reversible chains and detailed balance equations.
Recurrence and invariant distributions.
Convergence to equilibrium and ergodic properties.
Examples: random walk, gambler's ruin, Ehrenfest model, branching processes.
Examples: birth-and-death processes and queues.
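
As an illustration of several items in the list above (transition matrix, m-step transition matrices, Chapman-Kolmogorov equations, global and detailed balance), the following is a minimal sketch for the Ehrenfest model. It is not part of the course material: the function names, the choice of Python and the number of balls N = 4 are assumptions made only for this example. State i is the number of balls in the first urn, and exact fractions are used so that the identities hold without rounding.

```python
from fractions import Fraction
from math import comb


def ehrenfest_matrix(n_balls):
    """Transition matrix of the Ehrenfest urn with n_balls balls.

    State i is the number of balls in the first urn; at each step one ball,
    chosen uniformly at random, is moved to the other urn.
    """
    size = n_balls + 1
    P = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        if i > 0:
            P[i][i - 1] = Fraction(i, n_balls)            # a ball leaves urn 1
        if i < n_balls:
            P[i][i + 1] = Fraction(n_balls - i, n_balls)  # a ball enters urn 1
    return P


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(P, m):
    """m-step transition matrix P^m (m >= 0) by repeated multiplication."""
    n = len(P)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, P)
    return result


def left_multiply(pi, P):
    """Row vector times matrix: the distribution after one step."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


N = 4  # hypothetical number of balls, chosen only for the example
P = ehrenfest_matrix(N)

# Chapman-Kolmogorov: the 5-step matrix is the product of the 2- and 3-step ones.
assert mat_pow(P, 5) == mat_mul(mat_pow(P, 2), mat_pow(P, 3))

# Candidate invariant distribution: Binomial(N, 1/2).
pi = [Fraction(comb(N, i), 2 ** N) for i in range(N + 1)]

# Global balance: pi P = pi.
assert left_multiply(pi, P) == pi

# Detailed balance: pi_i P_ij = pi_j P_ji for all i, j (time reversibility).
assert all(pi[i] * P[i][j] == pi[j] * P[j][i]
           for i in range(N + 1) for j in range(N + 1))
```

Note that the Ehrenfest chain has period 2, so the rows of P^m do not converge as m grows even though the invariant distribution exists and is unique; this is why aperiodicity appears as a separate hypothesis in the convergence-to-equilibrium results.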

Teaching Methods

In keeping with the hybrid teaching mode set out in the Manifesto Accademico 2020-21 in response to the COVID-19 emergency, lectures will use both the blackboard and a tablet projected on the classroom screen and streamed over the internet. The first part of the course consists of two-hour lectures in which the theory of the methods alternates with the application of the models and the interpretation of the results. The statistical software R will be used in the applications. The second part of the course also consists of two-hour lectures on the theoretical contents, always supplemented by illustrative examples and exercises solved together with the students. Exercise sheets will be distributed regularly so that students can assess their own progress.

Verification of learning

Depending on the University's guidelines for conducting exams as the COVID-19 emergency evolves, exams may be held in person or online.
The final exam consists of three parts: 1) Writing a statistical report on an analysis of real data implemented in R. In the analysis the student must apply the appropriate models explained during the lectures, based on the characteristics of the given data. 2) Writing the solution to an assignment on the second part of the course, aimed at verifying the student's ability to solve problems in the theory of stochastic processes. 3) An oral interview in which the student must demonstrate that she/he has mastered the contents presented in both parts of the course and knows how to apply them correctly. The student can take the oral interview 3) only after obtaining a passing grade in parts 1) and 2).

Texts

The contents of the first part of the course will follow:
- Ornello Vitali (1993). Statistica per le Scienze Applicate: volume 1 e 2. Cacucci Editore.
- P. McCullagh and J. A. Nelder (1989). Generalized Linear Models. Chapman & Hall/CRC.
- R. Koenker (2005). Quantile Regression. Cambridge University Press.
For the computational part with R we recommend:
1. An Introduction to R: https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
2. Il linguaggio R: concetti introduttivi ed esempi (II edizione) by Vito M. R. Muggeo and Giancarlo Ferrara: https://cran.r-project.org/doc/contrib/nozioniR.pdf
Other textbooks related to the course:
- G. Casella, R. L. Berger (2002). Statistical Inference (2nd edn.). Wadsworth Group, CA, USA.

For the second part of the course the following books are recommended:
- P. Bremaud. Markov Chains, Gibbs Fields, Monte Carlo Simulation and Queues. Springer, 2020.
- S. Ross. Stochastic Processes. John Wiley & Sons, 1996.
- D. Kannan. An Introduction to Stochastic Processes. North-Holland Scientific Publisher, 1979.
- A. K. Basu. Introduction to Stochastic Processes. Narosa Publishing House Pvt. Ltd., 2003.