Robust Random Regression Imputation method for missing data in the presence of outliers

Bibliographic Details
Main Author: John, Ahamefule Happy
Format: Thesis
Language: English
Published: 2013
Online Access: http://psasir.upm.edu.my/id/eprint/49818/1/FS%202013%2042RR.pdf
Description
Summary: The Ordinary Least Squares (OLS) estimator is the best regression estimator when all of its assumptions are met. However, missing data and outliers can distort the OLS estimates and increase the variability of the parameter estimates. The main focus of this research is to take remedial measures for missing data in regression in the presence of outliers. In regression analysis, the dependent variable (Y) is modelled as a function of the independent variable (X), so outliers and missing values can occur in both the X and Y directions. When values are missing in the Y direction, it is very common to use OLS-based Random Regression Imputation (RRI). RRI appears to be a good method when the data contain no outliers, but it performs poorly in their presence, because it is an OLS-based imputation method and OLS is heavily affected by outliers. We therefore modify the OLS-based RRI by incorporating the robust MM-estimate, which is less affected by outliers, and call the result Robust Random Regression Imputation (RRRI). The proposed method is compared with some well-known methods of estimating missing data. The results indicate that the RRRI method outperforms the existing methods in the presence of outliers. Since outliers and missing data can occur in both directions, we also consider the situation in which observations are missing in the explanatory variable X. Here, the Dummy Variable (DV) approach is one of the best approaches for predicting the missing data. However, this approach also performs poorly in the presence of outliers. As an alternative, a Robust Inverse Regression Technique is proposed to obtain better estimates. Real data examples and Monte Carlo simulation studies show that the proposed robust methods perform better than the classical methods.
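
The contrast between OLS-based and robust random regression imputation described in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the robust fit below is a Huber M-estimator computed by iteratively reweighted least squares, used as a simpler stand-in for the MM-estimate named in the summary, and the function names (fit_ols, fit_huber, random_regression_impute), the tuning constant 1.345, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def fit_ols(X, y):
    """OLS coefficients and residual standard deviation."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(resid @ resid / (len(y) - X.shape[1]))
    return beta, sigma

def mad_scale(resid):
    """Robust scale: normalised median absolute deviation."""
    return max(1.4826 * np.median(np.abs(resid - np.median(resid))), 1e-12)

def fit_huber(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate via IRLS (assumed stand-in, NOT the MM-estimate)."""
    beta, _ = fit_ols(X, y)  # start from the OLS fit
    for _ in range(n_iter):
        resid = y - X @ beta
        u = np.abs(resid) / mad_scale(resid)
        # Huber weights: 1 for small residuals, c/u for large ones
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, mad_scale(y - X @ beta)

def random_regression_impute(x, y, fit, rng):
    """Replace NaNs in y by the fitted value plus a random normal residual."""
    miss = np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])
    beta, sigma = fit(X[~miss], y[~miss])
    y_imp = y.copy()
    y_imp[miss] = X[miss] @ beta + rng.normal(0.0, sigma, miss.sum())
    return y_imp, beta

# Synthetic example (assumed data): y = 2 + 3x + noise,
# with 20 outliers in Y and 40 missing Y values.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, 1, n)
idx = rng.permutation(n)
out_idx, miss_idx = idx[:20], idx[20:60]
y[out_idx] += 50.0          # contaminate some observed responses
y_obs = y.copy()
y_obs[miss_idx] = np.nan    # values missing in the Y direction

y_ols, b_ols = random_regression_impute(x, y_obs, fit_ols, rng)
y_rob, b_rob = random_regression_impute(x, y_obs, fit_huber, rng)
print("OLS-based coefficients:   ", b_ols)
print("Robust-fit coefficients:  ", b_rob)
```

In this sketch the only difference between the two imputations is the fitting function passed in, which mirrors the idea in the summary of swapping the OLS fit inside RRI for a robust one. A full MM-estimate would additionally need a high-breakdown initial estimate, which is omitted here.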