Development of geospatial model for tuberculosis prediction in Gombak, Selangor, Malaysia

Bibliographic Details
Main Author: Mohidem, Nur Adibah
Format: Thesis
Language: English
Published: 2021
Online Access: http://psasir.upm.edu.my/id/eprint/103773/1/NUR%20ADIBAH%20BINTI%20MOHIDEM%20-%20IR.pdf
Description
Summary: Background: Tuberculosis (TB) cases have increased drastically over the last two decades, and TB remains one of the deadliest infectious diseases in Malaysia. Preventing and controlling the disease depends not only on molecular epidemiology but also on an explicit understanding of spatial epidemiology, which assesses the distribution of disease in different locations. However, few studies have evaluated the spatial relationship of both sociodemographic and environmental factors with TB cases in the country. Objective: This study used geospatial technologies i) to investigate the trend and spatial pattern of TB cases; ii) to investigate the spatial distribution of TB cases and its association with sociodemographic and environmental factors; iii) to develop a prediction model of TB cases; and iv) to develop a web-GIS application for plotting TB cases. Methodology: Sociodemographic data for 3325 TB cases, such as age, gender, race, nationality, country of origin, educational level, employment status, health care worker status, income status, residency, and smoking status, from January 2013 to December 2017 in Gombak were collected from the MyTB web and Tuberculosis Information System (TBIS) file. Environmental data consisting of air pollution data such as air quality index (AQI), carbon monoxide (CO), nitrogen dioxide (NO2), sulphur dioxide (SO2), and particulate matter 10 (PM10) were obtained from the Department of Environment Malaysia from July 2012 to December 2017. Weather data were obtained over the same period: rainfall from the Department of Irrigation and Drainage Malaysia, and relative humidity, temperature, wind speed, and atmospheric pressure from the Malaysian Meteorological Department. Global Moran's I, kernel density estimation, and Getis-Ord Gi* statistics were applied to identify the spatial pattern of TB cases.
Ordinary least squares (OLS) and geographically weighted regression (GWR) models were used to determine the spatial association of sociodemographic and environmental factors with the TB cases. Multiple linear regression (MLR) and artificial neural network (ANN) models were applied to develop the prediction model of TB cases. A web-GIS application was set up in the PHP: Hypertext Preprocessor (PHP) CodeIgniter framework with the aid of ArcGIS JavaScript Application Programming Interface (API) 3.7, using HyperText Markup Language (HTML), Cascading Style Sheets (CSS), JavaScript, and PHP. The ESRI map was used as the base map and combined with the web GIS technology via the ArcGIS API. Results: Spatial autocorrelation analysis indicated that the cases were clustered (p < 0.05) over the five-year period and in the years 2016 and 2017. Kernel density estimation identified the high-density regions, while Getis-Ord Gi* statistics identified the hotspot locations, which were consistently located in the southwestern part of the district. This could be attributed to the overcrowding of inmates in the Sungai Buloh prison located there. The GWR model based on the environmental factors (GWR2) was the best model for determining the spatial distribution of TB cases, with the highest R² value (0.98) and local R² > 0.70, which comprised 2006 cases of TB. The ANN was superior to MLR in predicting TB cases, with adjusted R² values ranging from 0.35 to 0.47 compared with 0.07 to 0.14 for MLR. The sensitivity analysis of the relative importance of each input variable showed that combining the sociodemographic and environmental data in ANN3, which had the highest adjusted R² value of 0.47, errors below 6, and accuracies above 96%, predicted TB cases better than the ANN models that used the sociodemographic or environmental data alone.
The web-GIS application displays the location of TB cases and their sociodemographic factors on an interactive map. Conclusion: This study identified the spatial variability in the association between risk factors and TB cases, and visualized the high-risk areas using a user-friendly web mapping application, which helps improve case detection and targeted surveillance. The prediction of TB cases was possible with the use of geospatial data.
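
Illustration: the spatial autocorrelation step named in the summary (Global Moran's I) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis's implementation. The function name `morans_i`, the binary adjacency weights, and the four-area toy counts are all hypothetical. This record does not describe the software or weighting scheme actually used.

```python
def morans_i(values, weights):
    """Global Moran's I for n area values and an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all weights
    # Weighted cross-products of deviations between every pair of areas
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)                # total squared deviation
    return (n / s0) * (cross / ss)


# Hypothetical layout: four areas in a row; w_ij = 1 when areas i and j
# share an edge, 0 otherwise (an assumed weighting scheme).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Made-up case counts: high values side by side vs. high/low alternating.
clustered = morans_i([10, 8, 2, 1], chain)
alternating = morans_i([10, 1, 10, 1], chain)

print(f"clustered layout:   I = {clustered:.3f}")
print(f"alternating layout: I = {alternating:.3f}")
```

Positive values indicate that similar counts sit next to each other (clustering), and negative values indicate that neighbouring counts tend to differ. The sketch does not include the significance test behind a result such as p < 0.05, which is usually obtained by permutation or a normal approximation.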