Modified method for removing multicollinearity problem in multiple regression model
Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. An insignificant variable that does not contribute to a model may also affect the interpretation...
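
To make the effect of correlated predictors concrete, the sketch below computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF), which for a model with exactly two predictors equals 1 / (1 - r^2). The VIF is a standard diagnostic and is used here as an assumption; it is not necessarily the multicollinearity test applied in the thesis. The function names and all measurements are hypothetical and are not taken from the thesis data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a model with exactly two predictors.

    With two predictors, the R^2 from regressing one on the other is r^2,
    so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical measurements in centimetres (illustration only).
chest_girth = [88.0, 92.5, 95.0, 101.0, 104.5, 110.0]
shoulder_girth = [104.0, 108.0, 112.5, 117.0, 121.0, 128.0]
wrist_girth = [16.5, 15.0, 17.2, 16.0, 17.8, 16.3]

# Strongly correlated pair: large VIF, unstable coefficient estimates.
vif_high = vif_two_predictors(chest_girth, shoulder_girth)
# Weakly correlated pair: VIF close to 1, little inflation.
vif_low = vif_two_predictors(chest_girth, wrist_girth)

print(f"VIF (chest girth, shoulder girth): {vif_high:.1f}")
print(f"VIF (chest girth, wrist girth):    {vif_low:.2f}")
```

The standard error of a coefficient grows with the square root of its VIF, which is why a highly correlated pair makes both coefficients hard to estimate.
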
Main Author: | Yap, Sue Jinq |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2014 |
Subjects: | QA273-280 Probabilities. Mathematical statistics |
Online Access: | https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf |

id | my-ums-ep.41239 |
---|---|
record_format | uketd_dc |
institution | Universiti Malaysia Sabah |
collection | UMS Institutional Repository |
language | English |
topic | QA273-280 Probabilities. Mathematical statistics |
description | Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. An insignificant variable that does not contribute to a model may also affect the interpretation of the data. The key objective of this work is therefore to develop a best model that is free from multicollinearity and insignificant variables. The data set originally contains 25 variables. Using factor analysis, correlation coefficient values and dummy transformation, the following variables are identified: body weight as the dependent variable; chest diameter, shoulder girth, chest girth, bicep girth, forearm girth and wrist girth, each as a single quantitative independent variable; and ankle diameter, biacromial diameter, elbow diameter, wrist diameter and gender, each as a dummy variable. Interaction variables up to the fifth order (products of six variables) are considered. Variables that are weakly correlated with the dependent variable are not removed but are transformed into dummy variables. This work also examines the significance of interaction variables and of variables that are weakly correlated with the dependent variable. Applying the concept of backward elimination, multicollinearity and coefficient tests are used to discard variables systematically from each of the possible models. Multicollinearity source variables are removed using a modification of the Zainodin-Noraini multicollinearity remedial method. Finally, a best model is obtained that is free from multicollinearity and insignificant variables. Interaction variables are found to play an important role: the best model consists of two single quantitative independent variables (chest diameter and forearm girth), four first-order interaction variables (chest girth with wrist girth, and bicep girth with each of biacromial diameter, ankle diameter and gender) and one second-order interaction variable (chest girth, chest diameter and shoulder girth). The highest interaction order in the best model is the second order. The variables that are weakly correlated with the dependent variable (biacromial diameter, ankle diameter and gender) are found to be significant and appear in the best model as interaction variables with bicep girth. Thus, the results of this work suggest a suitable procedure for researchers dealing with a large number of independent variables. |
format | Thesis |
qualification_level | Master's degree |
author | Yap, Sue Jinq |
title | Modified method for removing multicollinearity problem in multiple regression model |
granting_institution | Universiti Malaysia Sabah |
granting_department | Sekolah Sains dan Teknologi |
publishDate | 2014 |
url | https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf |
_version_ | 1818611384760401920 |
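
The description counts interaction order as one less than the number of variables multiplied together (a fifth-order interaction is a product of six variables). A minimal sketch of enumerating such terms for the six quantitative variables follows. The function names `interaction_terms` and `interaction_value`, the variable labels and the sample values are hypothetical, and the thesis may construct or filter its interaction variables differently; interactions with the dummy variables, for instance, are not enumerated here.

```python
from itertools import combinations
from math import prod

# The six single quantitative independent variables named in the description.
QUANTITATIVE = [
    "chest_diameter",
    "shoulder_girth",
    "chest_girth",
    "bicep_girth",
    "forearm_girth",
    "wrist_girth",
]


def interaction_terms(names, max_order):
    """Map each interaction order k to all products of k + 1 distinct variables."""
    return {
        order: list(combinations(names, order + 1))
        for order in range(1, max_order + 1)
    }


def interaction_value(row, term):
    """Value of one interaction variable for a single observation."""
    return prod(row[name] for name in term)


terms = interaction_terms(QUANTITATIVE, max_order=5)
counts = {order: len(group) for order, group in terms.items()}
print(counts)  # number of candidate interaction variables per order

# Hypothetical observation (centimetres), not taken from the thesis data.
row = {"chest_girth": 100.0, "chest_diameter": 30.0, "shoulder_girth": 115.0}
second_order = ("chest_girth", "chest_diameter", "shoulder_girth")
value = interaction_value(row, second_order)
print(value)
```

Even with only six quantitative variables there are 57 candidate interaction terms, which illustrates why the description calls for discarding variables systematically rather than inspecting every term by hand.
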