Integrated geophysical and geotechnical investigation using machine learning techniques

Bibliographic Details
Main Author: Alel, Mohd. Nur Asmawisham
Format: Thesis
Language: English
Published: 2022
Subjects:
Online Access: http://eprints.utm.my/103013/1/MohdNurAsmawishamPSKA2022.pdf
Description
Summary: Due to the complexity of subsurface conditions, several boreholes are required to obtain subsurface information for any proposed project site. Manual interpretation of large borehole datasets is cumbersome and time-consuming. Similarly, applying automated geophysical methods such as seismic refraction and/or resistivity surveys to obtain and analyse soil parameters is difficult, because no samples are collected to verify the results. An effective and efficient approach to gathering and interpreting a large volume of subsurface data is therefore desirable. Such an approach can be achieved by combining boreholes, geophysical surveys, and Machine Learning (ML). Geophysical surveys reduce the number of boreholes required, which lowers the overall cost of site investigation. Hence, this research aims to develop an intelligent model using Machine Learning Algorithms (MLAs) to predict the profiles of soil properties and characteristics from borehole and geophysical investigation data. Five (5) locations in Johor Bahru, Johor, with similar site characteristics and subsurface lithology, were selected for this study. A total of twenty (20) boreholes and laboratory test results were used to obtain the required information, such as soil types, Standard Penetration Test N-value (SPT-N), moist and dry densities, Atterberg limits, and specific gravity, to be analysed in developing the algorithm. Python, a high-level general-purpose programming language, was used to code MLAs such as k-Nearest Neighbour (kNN), Random Forest (RF), Neural Network (NN), Linear Regression (LR) and AdaBoost. A total of 5,532,000 datasets were compiled for the prediction analysis and for cross-validation to assess the efficiency of the MLAs.
Statistical models incorporating empirical soil values were proposed using the multiple linear regression (MLR) method to develop new equations for estimating the SPT-N value. Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and the Coefficient of Determination (R²) were used as performance metrics to compare the results of the MLAs. For model selection, the lowest values of MSE (3.909), RMSE (1.950) and MAE (0.578) and the highest R² (0.987) were considered. The results show that the AdaBoost model predicted the soil parameter values better than the other ML models. Thus, it has been shown that developing such an algorithm reduces the cumbersome and time-consuming process of interpreting subsurface data while remaining cost-effective. Predicting the parameters also gives better engineering information and descriptions of the sites and their materials.
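
Note: The four performance metrics named in the summary can be illustrated with a minimal Python sketch. This is not the thesis code, which is not reproduced in this record. The function name `regression_metrics`, the sample SPT-N values and the two candidate prediction lists are hypothetical, chosen only to show how the lowest-error, highest-R² model might be selected.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R^2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R^2 is undefined when every observed value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical SPT-N values (illustrative only, not data from the thesis).
observed = [4, 8, 12, 20, 35, 50]
candidates = {
    "model_a": [5, 7, 13, 19, 36, 49],
    "model_b": [8, 4, 16, 15, 40, 44],
}

scores = {name: regression_metrics(observed, pred)
          for name, pred in candidates.items()}

# Select the candidate with the lowest RMSE, one of the criteria in the summary.
best = min(scores, key=lambda name: scores[name]["RMSE"])

for name, s in scores.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
print("selected:", best)
```

In this sketch the selection uses RMSE alone; a fuller comparison would check that MSE, MAE and R² point to the same model, as the summary describes.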