Parametric mixture model of three components for modelling heterogeneous survival data

Bibliographic Details
Main Author: Mohammed, Yusuf Abbakar
Format: Thesis
Language: eng
Published: 2015
Subjects: QA273-280 Probabilities. Mathematical statistics
Online Access: https://etd.uum.edu.my/6095/1/s93379_01.pdf
https://etd.uum.edu.my/6095/2/s93379_02.pdf
id my-uum-etd.6095
record_format uketd_dc
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Yatim, Bidin
Ismail, Suzilah
topic QA273-280 Probabilities. Mathematical statistics
description Previous studies showed that two-component survival mixture models performed better than pure classical parametric survival models. However, three-component survival mixture models are needed because heterogeneous survival data commonly comprise more than two distributions. This study therefore developed two three-component survival mixture models. Model 1 is a three-component parametric survival mixture model of Gamma distributions, and Model 2 is a three-component parametric survival mixture model of Exponential, Gamma and Weibull distributions. Both models were estimated using the Expectation Maximization (EM) algorithm and validated via simulation and empirical studies. The simulation was repeated 300 times using three sample sizes (100, 200, 500), three censoring percentages (10%, 20%, 40%), and two sets of mixing probabilities: ascending (10%, 40%, 50%) and descending (50%, 30%, 20%). Several real data sets were used in the empirical study, and model comparisons were carried out. Model 1 was compared with the pure classical parametric survival model and with two- and four-component parametric survival mixture models of Gamma distributions. Model 2 was compared with pure classical parametric survival models and with three-component parametric survival mixture models of the same distribution. Graphical presentations, log-likelihood (LL), Akaike Information Criterion (AIC), Mean Square Error (MSE) and Root Mean Square Error (RMSE) were used to evaluate performance. The simulation findings revealed that both models performed well with a large sample size, a small censoring percentage and ascending mixing probabilities. Both models also produced smaller errors than the other types of survival models in the empirical study. These results indicate that both developed models are more accurate and provide a better option for analysing heterogeneous survival data.
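The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code. The Gamma shape and scale values, the censoring mechanism (a randomly chosen fraction of subjects censored at a uniform time before their event) and the helper names `simulate_mixture`, `aic` and `rmse` are all assumptions made for illustration; only the sample sizes, censoring percentages, mixing probabilities and replication count come from the record.

```python
import itertools
import math
import random

# Design factors taken from the abstract.
SAMPLE_SIZES = (100, 200, 500)
CENSORING = (0.10, 0.20, 0.40)
MIXING = {"ascending": (0.10, 0.40, 0.50), "descending": (0.50, 0.30, 0.20)}
REPLICATIONS = 300

# Hypothetical Gamma parameters (not reported in the record).
SHAPES = (2.0, 5.0, 9.0)
SCALES = (1.0, 1.0, 1.0)

# 3 sample sizes x 3 censoring levels x 2 mixing sets.
scenarios = list(itertools.product(SAMPLE_SIZES, CENSORING, MIXING))


def simulate_mixture(n, weights, shapes, scales, censor_frac, rng):
    """Draw n (time, event) pairs from a Gamma mixture with right-censoring.

    event = 1 means the survival time was observed, 0 means censored.
    The censoring scheme is an assumed, simplified one.
    """
    data = []
    for _ in range(n):
        # Pick a component according to the mixing probabilities.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        t = rng.gammavariate(shapes[k], scales[k])
        if rng.random() < censor_frac:
            # Censor at a uniform time before the true event time.
            data.append((rng.uniform(0.0, t), 0))
        else:
            data.append((t, 1))
    return data


def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_lik


def rmse(estimates, truth):
    """Root mean square error of repeated estimates of one parameter."""
    mse = sum((e - truth) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse)
```

In a full study, each of the scenarios would be simulated `REPLICATIONS` times, the mixture fitted by EM on every replicate, and `rmse` applied to the resulting parameter estimates. A three-component Gamma mixture has 8 free parameters (three shapes, three scales and two free mixing probabilities), which is the value `aic` would receive for Model 1 under this parameterisation.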
format Thesis
qualification_name Ph.D.
qualification_level Doctorate
author Mohammed, Yusuf Abbakar
title Parametric mixture model of three components for modelling heterogeneous survival data
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2015
url https://etd.uum.edu.my/6095/1/s93379_01.pdf
https://etd.uum.edu.my/6095/2/s93379_02.pdf