Time series support vector regression models with missing data treatments for water level prediction

Bibliographic Details
Main Author: Ibrahim, Noraini
Format: Thesis
Language: English
Published: 2014
Online Access: http://eprints.utm.my/id/eprint/48022/25/NorainiIbrahimMFC2014.pdf
Description
Summary: A rise in water level is an important indicator for flood alerts. The water level of a river depends on variables such as the month, rainfall volume, temperature, relative humidity and surface wind. The main purpose of this research is to find a suitable method for predicting the water level of the Galas River in Kelantan in order to anticipate floods. Secondary data on the water level of the Galas River were collected from the Department of Irrigation and Drainage Malaysia and the Malaysian Meteorological Department. Data were missing for certain months; these values were replaced using two treatments, means and linear regression, each based on the corresponding months in other years. Both treatments were carried through the subsequent analyses. Multiple Linear Regression (MLR), Partial Least Squares Regression (PLSR), Support Vector Regression (SVR) and SVR-based time series regression were used to analyse the data. The MLR analysis detected multicollinearity, which was addressed by applying PLSR. However, PLSR is a linear model and may not be appropriate for a nonlinear case such as the Galas River. A nonlinear method, SVR, was therefore applied. In addition, SVR-based time series regression was proposed to accommodate the time-based water level data and to overcome the issues of linearity and multicollinearity. The results show that linear regression is the better missing-data treatment for both SVR and SVR-based time series regression. With a Gaussian kernel, these two regressions also achieved lower cross-validation mean squared error than MLR and PLSR. The major finding of this study is that both SVR and SVR-based time series regression predict the water level, and thus anticipate floods, significantly better than MLR and PLSR.
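
Illustration: the two missing-data treatments named in the summary (means and linear regression over the same month in other years) might look roughly like the Python sketch below. The summary does not give the exact formulation, so this is an assumption throughout: the function names are hypothetical, the regression is taken to be a least-squares line of water level against year, and the sample values are invented for the example and are not Galas River data.

```python
from statistics import mean

def impute_month_mean(series, year, month):
    """Fill a gap with the mean of the same calendar month in other years."""
    observed = [v for (y, m), v in series.items()
                if m == month and y != year and v is not None]
    return mean(observed)

def impute_month_regression(series, year, month):
    """Fill a gap from a least-squares line of value against year,
    fitted on the same calendar month in other years (assumed form)."""
    points = [(y, v) for (y, m), v in series.items()
              if m == month and y != year and v is not None]
    xs = [y for y, _ in points]
    ys = [v for _, v in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in points)
    slope = sxy / sxx
    return y_bar + slope * (year - x_bar)

# Invented January water levels in metres; January 2010 is missing.
levels = {(2008, 1): 30.0, (2009, 1): 31.0, (2010, 1): None, (2011, 1): 33.0}

mean_fill = impute_month_mean(levels, 2010, 1)
reg_fill = impute_month_regression(levels, 2010, 1)
```

On a series with a steady year-to-year trend like this invented one, the two treatments give different fills: the mean ignores the trend, while the regression follows it.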