Modified method for removing multicollinearity problem in multiple regression model

Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. An insignificant variable that does not contribute to a model may also affect the interpretation...
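
The effect on standard errors described above can be illustrated with a small sketch. It is not taken from the thesis: the data are synthetic, and the helper `ols_standard_errors` and all variable names are hypothetical. It fits the same two-predictor model twice, once with independent predictors and once with a second predictor that is nearly a copy of the first, and compares the coefficient standard errors.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Fit OLS with an intercept; return coefficients and their standard errors."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])      # unbiased residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)          # covariance of the estimates
    return beta, np.sqrt(np.diag(cov))

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)                      # unrelated to x1
x2_collinear = x1 + 0.05 * rng.normal(size=n)      # almost identical to x1
e = rng.normal(size=n)

y_indep = 1 + 2 * x1 + 3 * x2_indep + e
y_coll = 1 + 2 * x1 + 3 * x2_collinear + e

_, se_indep = ols_standard_errors(np.column_stack([x1, x2_indep]), y_indep)
_, se_coll = ols_standard_errors(np.column_stack([x1, x2_collinear]), y_coll)

print("SE of x1 coefficient, independent predictors:", se_indep[1])
print("SE of x1 coefficient, collinear predictors:  ", se_coll[1])
```

In this setup the standard error of the `x1` coefficient should be much larger in the collinear fit, although the noise level and sample size are the same in both.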

Bibliographic Details
Main Author: Yap, Sue Jinq
Format: Thesis
Language: English
Published: 2014
Subjects: QA273-280 Probabilities. Mathematical statistics
Online Access: https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf
https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf
id my-ums-ep.41239
record_format uketd_dc
institution Universiti Malaysia Sabah
collection UMS Institutional Repository
language English
topic QA273-280 Probabilities. Mathematical statistics
description Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. Insignificant variables that do not contribute to a model may also distort the interpretation of the data. The key objective of this work is therefore to develop a best model that is free from multicollinearity and from insignificant variables. The data set originally contains 25 variables. Using factor analysis, correlation coefficient values and dummy transformation, the following variables are identified: body weight as the dependent variable; chest diameter, shoulder girth, chest girth, bicep girth, forearm girth and wrist girth, each as a single quantitative independent variable; and ankle diameter, biacromial diameter, elbow diameter, wrist diameter and gender, each as a dummy variable. Interaction variables up to the fifth order (products of six variables) are considered. Variables that are weakly correlated with the dependent variable are not removed but are transformed into dummy variables. This work also examines the significance of interaction variables and of variables that are weakly correlated with the dependent variable. Applying the concept of backward elimination, multicollinearity and coefficient tests are used to discard variables systematically from each of the possible models. Multicollinearity source variables are removed using a modification of the Zainodin-Noraini multicollinearity remedial method. Finally, a best model is obtained that is free from multicollinearity and insignificant variables.
Interaction variables are found to play an important role: the best model consists of two single quantitative independent variables (chest diameter, forearm girth), four first-order interaction variables (chest girth with wrist girth, and bicep girth with each of biacromial diameter, ankle diameter and gender) and one second-order interaction variable (chest girth, chest diameter and shoulder girth). The highest interaction order in the best model is the second order. The variables that are weakly correlated with the dependent variable (biacromial diameter, ankle diameter and gender) are found to be significant, and each appears in the best model as an interaction variable with bicep girth. Thus, the results of this work suggest a suitable procedure for researchers dealing with a large number of independent variables.
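
The two-stage elimination outlined in the description can be sketched roughly as follows. This is a generic stand-in, not the thesis's modified Zainodin-Noraini procedure, whose details the record does not give. The sketch assumes that a pair of predictors counts as a multicollinearity source when their absolute correlation exceeds 0.75, that the member of the pair less correlated with the response is dropped, and that a coefficient is insignificant when |t| < 1.96. The function names, both thresholds, and the synthetic data are hypothetical.

```python
import numpy as np

def t_statistics(X, y):
    """t-statistics of the OLS slope coefficients (intercept fitted but omitted)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return (beta / se)[1:]

def eliminate(X, y, names, r_max=0.75, t_min=1.96):
    """Drop collinear predictors, then insignificant ones, one at a time."""
    X, names = X.copy(), list(names)
    # Stage 1 (assumed rule): while some pair has |r| > r_max, drop the
    # member of the most correlated pair that is less correlated with y.
    while X.shape[1] > 1:
        r = np.abs(np.corrcoef(X, rowvar=False))
        np.fill_diagonal(r, 0.0)
        i, j = (int(k) for k in np.unravel_index(np.argmax(r), r.shape))
        if r[i, j] <= r_max:
            break
        r_iy = abs(np.corrcoef(X[:, i], y)[0, 1])
        r_jy = abs(np.corrcoef(X[:, j], y)[0, 1])
        drop = i if r_iy < r_jy else j
        X = np.delete(X, drop, axis=1)
        names.pop(drop)
    # Stage 2 (backward elimination): repeatedly drop the predictor with
    # the smallest |t| until every remaining |t| is at least t_min.
    while X.shape[1] > 0:
        t = np.abs(t_statistics(X, y))
        k = int(np.argmin(t))
        if t[k] >= t_min:
            break
        X = np.delete(X, k, axis=1)
        names.pop(k)
    return names

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(1)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.5 * rng.normal(size=n)           # strongly correlated with x1
d = (rng.random(n) < 0.3).astype(float)      # dummy variable
d_x2 = d * x2                                # first-order interaction variable
junk = rng.normal(size=n)                    # unrelated to the response
y = 1 + 2 * x1 + 1.5 * d_x2 + rng.normal(size=n)

X = np.column_stack([x1, x2, x3, d, d_x2, junk])
kept = eliminate(X, y, ["x1", "x2", "x3", "d", "d_x2", "junk"])
print(kept)
```

In this synthetic example the interaction term `d_x2` stays in the model while at most one of the correlated pair `x1` and `x3` remains, which mirrors the abstract's observation that a weakly correlated dummy variable can still matter through an interaction.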
format Thesis
qualification_level Master's degree
author Yap, Sue Jinq
title Modified method for removing multicollinearity problem in multiple regression model
granting_institution Universiti Malaysia Sabah
granting_department Sekolah Sains dan Teknologi
publishDate 2014
url https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf
https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf