Robust Diagnostics In Logistic Regression Model
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2010 |
Subjects: | |
Online Access: | http://psasir.upm.edu.my/id/eprint/12362/1/FS_2010_19A.pdf |
Summary: | In recent years, diagnostics have become an essential part of logistic
regression modelling because the Maximum Likelihood Estimator (MLE) is
inconsistent and sensitive in the presence of high leverage points and residual
outliers. High leverage points and residual outliers tend to break the covariate
pattern, resulting in biased parameter estimates. Identifying high leverage
points and residual outliers is therefore considered vital for improving the
performance of the MLE.
High leverage points and residual outliers adversely affect inference by
inducing large values of the Influence Function (IF). For the identification of
high leverage points, Imon (2006) proposed the Distance from the Mean (DM)
diagnostic method. The weakness of the DM method is that, although it identifies
the high leverage points correctly, it tends to swamp some low leverage points,
that is, to flag them wrongly as high leverage points. Deleting these low
leverage points may lead to a loss of efficiency and precision in the parameter
estimates.
The Robust Logistic Diagnostic (RLGD) is proposed as an alternative approach
that performs well compared to the DM method. The RLGD method combines robust
approaches with diagnostic procedures. A robust approach is first used to
identify suspected high leverage points by computing the Robust Mahalanobis
Distance (RMD) based on the Minimum Volume Ellipsoid (MVE) estimator or the
Minimum Covariance Determinant (MCD) estimator. For confirmation, a diagnostic
procedure is then used to compute the potentials. The RLGD method ensures that
only genuine high leverage points are identified and that the identification is
free from swamping and masking effects. The performance of the RLGD method is
investigated using real examples and a Monte Carlo simulation study. The real
examples and the simulation results indicate that the RLGD method correctly
identifies the high leverage points (increasing the probability of the Detection
of Capability (DC)) and reduces the number of swamped low leverage points
(decreasing the probability of the False Alarm Rate (FAR)).
The Standardized Pearson Residual (SPR) is only successful in identifying a
single residual outlier. The SPR method is less effective when residual outliers
are present in the covariates. The Generalized Standardized Pearson Residual
(GSPR) proposed by Imon and Hadi (2008) is a successful method for identifying
residual outliers. However, the initial stage of the GSPR method relies on
graphical methods, which depend on the observer's judgement and are not suitable
for higher dimensional covariates. The Modified Standardized Pearson Residual
(MSPR), based on the RLGD method, is proposed as a more reliable alternative.
The MSPR method produces results similar to those of the GSPR method, and its
attractive feature is that it is easier to apply.
This research also utilizes the RLGD method in bootstrap procedures. The
Classical Bootstrap (CB) procedure by Random-x Re-sampling is not robust to
high leverage points. To address this problem, two newly developed bootstrap
procedures based on the RLGD method are proposed: the Diagnostic Logistic
Before Bootstrap (DLGBB) and the Weighted Logistic Bootstrap with Probability
(WLGBP). In the DLGBB procedure, the high leverage points are excluded before
the re-sampling process is applied. In the WLGBP procedure, the high leverage
points are assigned low probabilities and consequently have low chances of
being selected in the re-sampling process. Simulation results show that the
DLGBB and WLGBP procedures are more robust to high leverage points than the
CB procedure. |
---|
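
The two bootstrap procedures described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation. A univariate median/MAD distance stands in for the MVE- or MCD-based RMD, and the potential-based confirmation step is omitted. The function names, the cutoff of 3, and the `low_weight` value are hypothetical choices made only for illustration; the summary does not state how the WLGBP probabilities are set.

```python
import random
import statistics


def robust_distances(x):
    """Univariate stand-in for the RMD: |x - median| / (1.4826 * MAD).

    Assumption: the thesis uses MVE/MCD-based distances on the full
    covariate matrix; this simplified version handles one covariate only
    and requires a non-zero MAD.
    """
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [abs(v - med) / scale for v in x]


def flag_high_leverage(x, cutoff=3.0):
    """Flag cases whose robust distance exceeds a (hypothetical) cutoff."""
    return [d > cutoff for d in robust_distances(x)]


def dlgbb_indices(flags, rng):
    """DLGBB-style draw: drop flagged cases, then resample the rest
    with replacement (Random-x resampling on the cleaned data)."""
    keep = [i for i, flagged in enumerate(flags) if not flagged]
    return rng.choices(keep, k=len(keep))


def wlgbp_indices(flags, rng, low_weight=0.05):
    """WLGBP-style draw: keep all cases, but give flagged cases a low
    selection weight (low_weight is an illustrative value)."""
    weights = [low_weight if flagged else 1.0 for flagged in flags]
    return rng.choices(range(len(flags)), weights=weights, k=len(flags))


# Toy covariate with one extreme value in the last position.
x = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 9.5]
flags = flag_high_leverage(x)
print(flags)  # only the last case (9.5) is flagged

rng = random.Random(2010)
print(dlgbb_indices(flags, rng))  # indices drawn from the 7 unflagged cases
print(wlgbp_indices(flags, rng))  # index 7 may appear, but rarely
```

In a full analysis, each list of indices would select the cases used to refit the logistic regression model, and the spread of the refitted coefficients across many draws would give the bootstrap estimates.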