The power of monitoring: How to make the most of a contaminated multivariate sample

 

Andrea Cerioli
Department of Economics and Management, University of Parma, Italy
andrea.cerioli@unipr.it

Marco Riani
Department of Economics and Management, University of Parma, Italy
mriani@unipr.it

Anthony C. Atkinson
The London School of Economics, London WC2A 2AE, UK
a.c.atkinson@lse.ac.uk

Aldo Corbellini
Department of Economics and Management, University of Parma, Italy
aldo.corbellini@unipr.it

Abstract


Diagnostic tools must rely on robust high-breakdown methodologies to avoid distortion in the presence of contamination by outliers. However, a disadvantage of having a single summary of the data, even a robust one, is that important choices have to be made prior to the analysis and their effect may be difficult to evaluate. We argue that an effective solution is to look at several pictures, and possibly at a whole movie, of the available data. This can be achieved by monitoring the results computed through the robust methodology of choice. We show the information gain that monitoring provides in the study of complex data structures through the analysis of multivariate datasets with different high-breakdown techniques. Our findings support the claim that the principle of monitoring is very flexible and that it can lead to robust estimators that are as efficient as possible. We also address through simulation some of the tricky inferential issues that arise from monitoring.

 

The datasets used in the paper can be downloaded here

The MATLAB file that reproduces all figures (except those from the simulation study) can be downloaded here (the FSDA toolbox is required)

 

Last modified 06/11/2017 10.50.41