An Improved Diabetes Risk Prediction Framework : An Indonesian Case Study

Underdiagnosis of diabetes often occurs in some ASEAN countries with relatively low doctor-to-patient ratios. A systematic framework for predicting diabetes risk factors is therefore believed to be crucial for refining diagnostics and improving accuracy. However, there is the issue of...

Bibliographic Details
Main Author: Sutanto, Daniel Hartono
Format: Thesis
Language: English
Published: 2018
Subjects:
Online Access:http://eprints.utem.edu.my/id/eprint/23340/1/An%20Improved%20Diabetes%20Risk%20Prediction%20Framework%20An%20Indonesian%20Case%20Study.pdf
http://eprints.utem.edu.my/id/eprint/23340/2/An%20Improved%20Diabetes%20Risk%20Prediction%20Framework%20An%20Indonesian%20Case%20Study.pdf
id my-utem-ep.23340
record_format uketd_dc
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
topic Q Science (General)
QA76 Computer software
description Underdiagnosis of diabetes often occurs in some ASEAN countries with relatively low doctor-to-patient ratios. A systematic framework for predicting diabetes risk factors is therefore believed to be crucial for refining diagnostics and improving accuracy. However, there is the issue of noisy datasets, which appear as incomplete data, and the outlier class problem, which affects sampling bias. Existing frameworks have difficulty identifying the critical risk factors of diabetes; some are considerably inaccurate and consume substantial computation time. The purpose of this study is to develop a suitable framework for predicting diabetes risk. From a complete blood test, the framework predicts and classifies the output as either diabetes risk or no diabetes risk. A Diabetes Risk Prediction Framework (DRPF) was developed from the literature review, and case studies were afterwards conducted in three private hospitals in Semarang. Because comparisons and analyses of combinations of feature selection and classification algorithms were lacking, analyses were conducted to find suitable components for the framework. DRPF comprises four main sections: pre-processing, outlier detection, risk weighting, and learning. Pre-processing resolves the issue of missing data and then normalizes the data. Outlier treatment employs k-means clustering to validate the class. Suitable components were selected through a comparison of classifier algorithms and feature selection methods. Attribute-weighting-based feature selection was selected for assigning weights. The weighted risk factors were used on the training dataset in order to improve the accuracy and computation time of the prediction.
In the learning section, Support Vector Machine and Artificial Neural Network were selected as suitable classification algorithms, while Gradient Boosted Tree was employed to interpret the rules of the black-box classifiers. The framework was tested on the Pima Indian Dataset as the public dataset and the Semarang Hospital Dataset as the private dataset (800 patients' data). To validate the DRPF, four case studies investigated Subject Matter Expert (SME) groups based on their agreement level. The questionnaire covered the DRPF components, the implementation of DRPF, and the viability of DRPF. The DRPF components were validated by the SMEs, whereby the group ascertained the five highest risk factors assigned by attribute weighting: HbA1c, systole/diastole, blood glucose, and creatinine and blood urea nitrogen. Results from the questionnaire revealed an average agreement level of 80%. In conclusion, DRPF is implementable as a prototype and has been highly accepted by Indonesian practitioners as an aid for the diagnosis of diabetes.
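The first two DRPF stages named in the description (pre-processing and outlier detection) can be illustrated with a minimal sketch. This record does not state how the thesis implements them, so everything below is an assumption made for illustration: mean imputation for missing values, min-max normalization, k-means with k = 2 and first-k initialization, and a majority-label rule for "validating the class". The function names are hypothetical and the toy values are invented; they are not taken from the Pima Indian or Semarang Hospital datasets. The risk-weighting and learning stages are not sketched here.

```python
from collections import Counter
from statistics import mean


def impute_and_normalize(rows):
    """Fill missing values (None) with the column mean, then min-max scale to [0, 1].

    Assumption: the record only says pre-processing handles missing data and
    normalizes; mean imputation and min-max scaling are illustrative choices.
    """
    columns = []
    for col in zip(*rows):
        fill = mean(v for v in col if v is not None)
        filled = [fill if v is None else v for v in col]
        lo, hi = min(filled), max(filled)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        columns.append([(v - lo) / span for v in filled])
    return [list(p) for p in zip(*columns)]


def kmeans_assign(points, k=2, max_iter=50):
    """Plain Lloyd's k-means; returns the cluster index of every point.

    Assumption: centers start at the first k points so the sketch is deterministic.
    """
    centers = [list(p) for p in points[:k]]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assign == assign:  # converged: no point changed cluster
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [mean(dim) for dim in zip(*members)]
    return assign


def flag_class_outliers(points, labels, k=2):
    """Return indices of records whose label disagrees with their cluster's majority label."""
    assign = kmeans_assign(points, k)
    flagged = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
        flagged.extend(i for i in idx if labels[i] != majority)
    return sorted(flagged)


# Invented toy records: [blood glucose, HbA1c]; label 1 = diabetes risk, 0 = no risk.
rows = [[90, 5.0], [95, None], [100, 5.4], [180, 8.0], [200, 9.0], [190, 8.5]]
labels = [0, 0, 0, 1, 1, 0]

points = impute_and_normalize(rows)
print(flag_class_outliers(points, labels))
```

In this sketch a flagged record is only a candidate for review; whether such records are relabelled, removed, or kept is a design decision this record does not describe.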
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Sutanto, Daniel Hartono
title An Improved Diabetes Risk Prediction Framework : An Indonesian Case Study
title_sort improved diabetes risk prediction framework : an indonesian case study
granting_institution UTeM
granting_department Faculty Of Information And Communication Technology
publishDate 2018
url http://eprints.utem.edu.my/id/eprint/23340/1/An%20Improved%20Diabetes%20Risk%20Prediction%20Framework%20An%20Indonesian%20Case%20Study.pdf
http://eprints.utem.edu.my/id/eprint/23340/2/An%20Improved%20Diabetes%20Risk%20Prediction%20Framework%20An%20Indonesian%20Case%20Study.pdf
_version_ 1747834039038902272
spelling my-utem-ep.23340 2022-02-07T11:47:47Z http://eprints.utem.edu.my/id/eprint/23340/ http://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=113297
Expert Systems with Applications, 42 (13), pp.5621 – 5631. 153. Repko, A.F., 2014. Interdisciplinary Research 1st ed. Cram101, ed.,222 154. Rowley, J., 2002. Using case studies in research. Management Research News, 25 (1), pp.16–27. 155. Rubin, D.B., 1976. Inference and missing data. Biometrika, 63 (3), pp.581–592. 156. Runeson, P. & Höst, M., 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering, 14 (2), pp.131–164. 157. Sammut, C. & Webb, G.I., 2010. Encyclopedia of Machine Learning C. Sammut & G. I. Webb, eds., Boston, MA: Springer US. Sarma, P.R., 1990. Red Cell Indices, 158. Sarwar, A. & Sharma, V., 2013. Comparative analysis of machine learning techniques in prognosis of type II diabetes. Ai & Society, 29 (1), pp.123–129. 159. Saumure, K. & Given, L.M., 2008. Data Saturation : SAGE Research Methods. SAGE Research Methods. 160. Schatz, B.R. & Berlin, R.B., 2011. Healthcare Infrastructure, London: Springer London. 161. Schmidhuber, J., 2015. Deep learning in neural networks : An overview. Neural Networks, 61, pp.85–117. 162. Schmidt, C.O. & Kohlmann, T., 2008. When to use the odds ratio or the relative risk? International Journal of Public Health, 53 (3), pp.165–7. 163. Schwinger, R.H. & Erdmann, E., 1992. Heart failure and electrolyte disturbances. Methods and findings in experimental and clinical pharmacology, 14 (4), pp.315–25. 164. Seera, M. & Lim, C.P., 2014. A hybrid intelligent system for medical data classification.Expert Systems with Applications, 41 (5), pp.2239–2249. 165. Sekaran, U. & Bougie, R., 2013. Research methods for business : a skill-building approach, Wiley. 166. Sekaran, U. & Bougie, R., 2009. Research Methods for Business: A Skill Building Approach 5th Editio., John Wiley & Sons Ltd. 167. Selvin, E., Steffes, M.W., Zhu, H., Matsushita, K., Wagenknecht, L., Pankow, J., Coresh, J. & Brancati, F.L., 2010. 
Glycated hemoglobin, diabetes, and cardiovascular risk in nondiabetic adults. The New England Journal of Medicine, 362 (9), pp.800–811. 168. Seuring, T., Archangelidi, O. & Suhrcke, M., 2015. The Economic Costs of Type 2 Diabetes: A Global Systematic Review. PharmacoEconomics, 33 (8), pp.811–831. 169. Shani, G., Chickering, M. & Meek, C., 2008. Mining recommendations from the web. Proceedings of the 2008 ACM Conference on Recommender Systems - RecSys ’08. New York, New York, USA: ACM Press, p. 35. 170. Shantsila, E. & Lip, G.Y.H., 2014. Thrombotic Complications in Heart Failure: An Underappreciated Challenge. Circulation, 130 (5), pp.387–389. 171. Shi, J., Zhang, S. & Qiu, L., 2013. Credit scoring by feature-weighted support vector machines. Journal of Zhejiang University SCIENCE C, 14 (3), pp.197–204. 172. Sigillito, V., 1990. Pima Indians Diabetes Database. UCI Machine Learning Repository. Available at: https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes [Accessed 20 December 2014]. 173. Silva, L.M., Marques de Sá, J. & Alexandre, L.A., 2008. Data classification with multilayey perceptrons using a generalized error function. Neural Networks, 21 (9), pp.1302–1310. 174. Singh, T.P., Neagu, N., Quattrone, M. & Briet, P., 2015. Operations Research and Enterprise Systems. Communications in Computer and Information Science, 509, pp.265–284. 175. Soewondo, P., Ferrario, A. & Tahapary, D., 2013. Challenges in diabetes management in Indonesia: a literature review. Globalization and Health, 9 (1), p.63. 176. Steyerberg, E.W., Vickers, A.J., Cook, N.R., Gerds, T., Gonen, M., Obuchowski, N., Pencina, M.J. & Kattan, M.W., 2010. Assessing the Performance of Prediction Models. Epidemiology, 21 (1), pp.128–138. 177. Straub, D.W., 1989. Validating Instruments in MIS Research. MIS Quarterly, 13 (2), p.147. 178. Tama, B.A. & Rodiyatul, F.S., 2011. An Early Detection Method of Type-2 Diabetes Mellitus in Public Hospital. Telkomnika, 9 (2), pp.287–294. 179. Tang, F. 
& Tao, H., 2007. Fast linear discriminant analysis using binary bases. Pattern Recognition Letters, 28, pp.2209–2218. 180. Tang, J., Alelyani, S. & Liu, H., 2014. Feature Selection for Classification: A Review. Data Classification: Algorithms and Applications, pp.37–64. 181. Teddlie, C. & Yu, F., 2007. Mixed Methods Sampling. Journal of Mixed Methods Research, 1 (1), pp.77–100. 182. Temurtas, H., Yumusak, N. & Temurtas, F., 2009. A comparative study on diabetes disease diagnosis using neural networks. Expert Systems with Applications, 36 (4), pp.8610–8615. 183. Trochim, W. & Donnelly, J.P., 2006. The Research Methods Knowledge Base 3rd editio.,225 184. Upadhyaya, S., Farahmand, K. & Baker-Demaray, T., 2013. Comparison of NN and LR 185. classifiers in the context of screening native American elders with diabetes. Expert Systems with Applications, 40 (15), pp.5830–5838. 186. Uzer, M.S., Yilmaz, N. & Inan, O., 2013. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification. The Scientific World Journal, 2013, pp.1–10. 187. Valdés, J.J., Romero, E. & Barton, A.J., 2012. Data and knowledge visualization with virtual reality spaces, neural networks and rough sets: Application to cancer and geophysical prospecting data. Expert Systems with Applications, 39 (18), pp.13193–13201. 188. Valentini, G., Heiko Hamann & Marco Dorigo, 2015. Efficient Decision-Making in a SelfOrganizing Robot Swarm: On the Speed Versus Accuracy Trade-Off. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. pp. 1305– 1314. 189. Vapnik, V., Golowich, S.E. & Smola, A., 1998. Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing. Advances in Neural Information Processing Systems 9. pp. 281–287. 190. Varma, K.V.S.R.P., Rao, A.A., Sita Maha Lakshmi, T. & Nageswara Rao, P.V., 2014. 
A computational intelligence approach for a better diagnosis of diabetic patients. Computers and Electrical Engineering, 40 (5), pp.1758–1765. 191. Wahono, R.S., Herman, N.S. & Ahmad, S., 2014. A Comparison Framework of Classification Models for Software Defect Prediction. Advanced Science Letters, 20, pp.1945–1950.226 192. Walsham, G., 1995. Interpretive case studies in IS research: nature and method. European Journal of Information Systems, 4 (2), pp.74–81. 193. Wang, J.T.L., Zaki, M.J., Toivonen, H.T.T. & Shasha, D., 2005. Introduction to Data Mining in Bioinformatics X. Wu et al., eds., London: Springer London. 194. Wannamethee, S.G., Shaper, A.G. & Perry, I.J., 1997. Serum Creatinine Concentration and Risk of Cardiovascular Disease : A Possible Marker for Increased Risk of Stroke. Stroke, 28 (3), pp.557–563. 195. WHO, 2010. Global status report on noncommunicable diseases. WHO Library Cataloguing-in-Publication Data, p.176. Available at: http://www.who.int/nmh/publications/ncd_report2010/en/ [Accessed 7 October 2014]. 196. WHO, 2015. Non-Communicable Diseases. Available at: http://www.who.int/mediacentre/factsheets/fs355/en/ [Accessed 10 January 2015]. 197. WHO, 2011. Noncommunicable Diseases Country Profiles. WHO Library Cataloguing-inPublication Data. Available at: http://www.who.int/nmh/publications/ncd_profiles2011/en/ [Accessed 8 January 2015]. 198. WHO, 2014. World Health Statistics. WHO Library Cataloguing-in-Publication Data. Available at: http://www.who.int/gho/publications/world_health_statistics/2014/en/ [Accessed 9 December 2014]. 199. Widjaja, M., 2012. Indonesia In Search of a Placement-Support Social Protection. ASEAN Economic Bulletin, 29 (3), p.184. 200. Wirth, R., 2000. CRISP-DM : Towards a Standard Process Model for Data Mining.227 Proceedings of the Fourth International Conference on the Practical Application of Knowledge Discovery and Data Mining, (24959), pp.29–39. 201. Witten, I.H., Frank, E. & Hall, M.A., 2006. 
Data Mining : Practical Machine Learning Tools and Techniques 3rd ed., The Morgan Kaufmann. 202. Wolberg, W.H., Mangasarian, O. & Aha, D.W., 1992. Breast Cancer Wisconsin (Original) Data Set. UCI Machine Learning Repository. Available at: http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29 [Accessed 16 November 2015]. 203. Wolberg, W.H., Street, W.N. & Mangasarian, O.L., 1992. Breast Cancer Wisconsin (Diagnostic) Data Set. UCI Machine Learning Repository. Available at: http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29 [Accessed 17 November 2015]. 204. Wu, X., Zhao, M., Pan, B., Zhang, J., Peng, M., Wang, L., Hao, X., Huang, X., Mu, R., Guo, 205. W., Qiao, R., Chen, W., Jiang, H., Ma, Y. & Shang, H., 2015. Complete Blood Count Reference Intervals for Healthy Han Chinese Adults R. L. Schmidt, ed. PLOS ONE, 10 (3), p.e0119669. 206. Wu, Y., Ding, Y., Tanaka, Y. & Zhang, W., 2014. Risk factors contributing to type 2 diabetes and recent advances in the treatment and prevention. International Journal of Medical Sciences, 11 (11), pp.1185–1200. 207. Wu, Y., Wu, Y., Wang, J., Yan, Z., Qu, L., Xiang, B. & Zhang, Y., 2011. An optimal tumor marker group-coupled artificial neural network for diagnosis of lung cancer. Expert Systems with Applications, 38 (9), pp.11329–11334.228 208. Yaghini, M., Khoshraftar, M.M. & Fallahi, M., 2013. A hybrid algorithm for artificial neural network training. Engineering Applications of Artificial Intelligence, 26 (1), pp.293–301. 209. Yan, X., 2009. Linear Regression Analysis: Theory and Computing, World Scientific. 210. Yayan, J., 2012. Erythrocyte sedimentation rate as a marker for coronary heart disease. Vascular Health and Risk Management, p.219. 211. Yilmaz, N., Inan, O. & Uzer, M.S., 2014. A new data preparation method based on clustering algorithms for diagnosis systems of heart and diabetes diseases. Journal of Medical Systems, 38 (48), pp.1–12. 212. Yin, R., 1994. 
Case Study Research: Design and Methods 2nd ed., Thousand Oaks, CA: Sage Publishing. 213. Yoon, H., Park, C.-S., Kim, J.S. & Baek, J.-G., 2013. Algorithm learning based neural network integrating feature selection and classification. Expert Systems with Applications, 40 (1), pp.231–241. 214. Zachman, J.A., 1987. A framework for information systems architecture. IBM Systems Journal, 26 (3), pp.276–292. 215. Zainuddin, M.F., 2006. K-Chart: A Tool for Research Planning and Monitoring. A Tool for Research Planning and Monitoring, 2 (1), pp.123–129. 216. Zhang, G.P., Patuwo, B.E. & Hu, M.Y., 2001. A simulation study of artificial neural networks for nonlinear time-series forecasting. Computers & Operations Research, 28, pp.381–396. 217. Zhang, Z., Dong, J., Luo, X., Choi, K.-S. & Wu, X., 2014. Heartbeat classification using disease-specific feature selection. Computers in Biology and Medicine, 46, pp.79–89. 218. Zhao, M., Fu, C., Ji, L., Tang, K. & Zhou, M., 2011. Feature selection and parameter optimization for support vector machines: A new approach based on genetic algorithm with feature chromosomes. Expert Systems with Applications, 38 (5), pp.5197–5204. 219. Zheng, T., Xie, W., Xu, L., He, X., Zhang, Y., You, M., Yang, G., Chen, Y. & Ph, D., 2017. A machine learning-based framework to identify type 2 diabetes through electronic health records. International Journal of Medical Informatics, 97, pp.120–127. 220. Zhou, B., Lu, Y., Hajifathalian, K., Bentham, J., Di Cesare, M., Cisneros, J.Z., et al., 2016. Worldwide trends in diabetes since 1980: A pooled analysis of 751 population-based studies with 4.4 million participants. The Lancet, 387 (10027), pp.1513–1530. 221. Zhu, J., Xie, Q. & Zheng, K., 2015. An improved early detection method of type-2 diabetes mellitus using multiple classifier system. Information Sciences, 292, pp.1–14