Bayesian logistic regression model on risk factors of type 2 diabetes mellitus

Bibliographic Details
Main Author: Chiaka, Emenyonu Sandra
Format: Thesis
Language: English
Published: 2016
Online Access: http://psasir.upm.edu.my/id/eprint/69118/1/FS%202016%2045%20UPM%20IR.pdf
Description
Summary: The logistic regression model is long established and commonly used to analyse a binary outcome, relating the binary dependent variable to several independent variables. From the frequentist point of view, the coefficients of the variables are estimated by the method of maximum likelihood. Bayesian analysis, by contrast, allows prior information to be incorporated: a prior distribution is assumed for each coefficient of interest and combined with the likelihood function to obtain the posterior distribution. The Bayesian logistic regression (BLR) methods used here are the Metropolis-Hastings (random walk) algorithm and the Gibbs sampler, with a non-informative flat prior and a non-informative non-flat prior, to obtain the posterior distribution of each coefficient. The flat prior is already widely used in different fields of study. The non-flat prior is the main focus of this research and, to the best of our knowledge, has not been applied to any type 2 diabetes mellitus (T2DM) dataset in Malaysia. This study evaluates risk factors such as age, ethnicity, gender, physical activity, hypertension, body mass index, family history of diabetes and waist circumference. The coefficients of these variables were first estimated by the method of maximum likelihood to identify the significant variables, which were then re-estimated using the BLR method. The BLR approach, via both the Gibbs sampler and the random walk Metropolis algorithm, suggests that family history of diabetes, waist circumference and body mass index are the significant risk factors associated with type 2 diabetes mellitus. 
The model results also show a slight decrease in the posterior standard deviation of the parameters under the non-flat prior compared with the Bayesian analysis using the non-informative flat prior. Because the differences between the models are small, all the models appear adequate and exhibit good fit.
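
As an illustration of the random walk Metropolis approach described in the summary, the following is a minimal Python sketch of Bayesian logistic regression under a flat prior. It is not the thesis's implementation: the data are synthetic, and the function names (`log_posterior`, `random_walk_metropolis`), the step size, the iteration count and the burn-in length are all assumptions chosen for illustration. Under a flat prior the log posterior equals the Bernoulli log-likelihood up to an additive constant, which is what the sampler evaluates.

```python
import numpy as np


def log_posterior(beta, X, y):
    """Log posterior under a flat prior (assumed here for illustration).

    With a flat prior this is the logistic log-likelihood up to a constant:
    sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ], with eta = X @ beta.
    """
    eta = X @ beta
    # logaddexp(0, eta) computes log(1 + exp(eta)) without overflow.
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))


def random_walk_metropolis(X, y, n_iter=5000, step=0.1, seed=0):
    """Draw coefficient samples with a Gaussian random walk proposal."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    beta = np.zeros(p)                      # arbitrary starting point
    current = log_posterior(beta, X, y)
    draws = np.empty((n_iter, p))
    accepted = 0
    for i in range(n_iter):
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        proposal = beta + step * rng.standard_normal(p)
        candidate = log_posterior(proposal, X, y)
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        draws[i] = beta                     # repeat the old value on rejection
    return draws, accepted / n_iter


# Synthetic data (hypothetical): an intercept and one standardised covariate.
data_rng = np.random.default_rng(42)
n = 500
X = np.column_stack([np.ones(n), data_rng.standard_normal(n)])
true_beta = np.array([-0.5, 1.0])
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (data_rng.uniform(size=n) < prob).astype(float)

draws, acc_rate = random_walk_metropolis(X, y)
posterior = draws[1000:]                    # discard an assumed burn-in
print("posterior mean:", posterior.mean(axis=0))
print("posterior sd:  ", posterior.std(axis=0))
print("acceptance rate:", acc_rate)
```

A non-flat prior could be sketched by adding its log density to the value returned by `log_posterior`; the posterior standard deviations printed above are the quantities the summary compares across priors.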