Publication Details

Reference Category Journals
Title (Primary) Evaluating the predictive power of different machine learning algorithms for groundwater salinity prediction of multi-layer coastal aquifers in the Mekong Delta, Vietnam
Author Tran, D.A.; Tsujimura, M.; Ha, N.T.; Nguyen, V.T.; Binh, D.V.; Dang, T.D.; Doan, Q.-V.; Bui, D.T.; Ngoc, T.A.; Phu, L.V.; Thuc, P.T.B.; Pham, T.D.
Journal Ecological Indicators
Year 2021
Department HDG
Volume 127
Language English
Keywords CatBoost Regression; Influencing factors; Groundwater salinization; Multi-layer coastal aquifers; Mekong Delta
Abstract Groundwater salinization is considered a major environmental problem in coastal areas worldwide, affecting ecosystems and human health. However, accurate prediction of groundwater salinity remains a challenge because of the complexity of salinization processes and their influencing factors. In this study, we evaluate state-of-the-art machine learning (ML) algorithms for predicting groundwater salinity and identify its influencing factors. We conducted a study of the coastal multi-layer aquifers of the Mekong River Delta (Vietnam), using a geodatabase of 216 groundwater samples and 14 conditioning factors. We compared the predictive performance of four ML techniques: the Random Forest Regression (RFR), Extreme Gradient Boosting Regression (XGBR), CatBoost Regression (CBR), and Light Gradient Boosting Regression (LGBR) models. Model performance was assessed using the root-mean-square error (RMSE), the coefficient of determination (R²), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). The results show that the CBR model performs best on both the training dataset (R² = 0.999, RMSE = 29.90) and the testing dataset (R² = 0.84, RMSE = 205.96, AIC = 720.60, and BIC = 751.04). Ten of the 14 influencing factors are the most important for groundwater salinity prediction: the distance to saline sources, the depth of the well screen, the groundwater level, the vertical hydraulic conductivity, the operation time, the well density, the extraction capacity, the thickness of the aquitard, the distance to fault, and the horizontal hydraulic conductivity. The results provide insights for policymakers in proposing remediation and management strategies for groundwater salinity issues in the context of excessive groundwater exploitation in coastal lowland regions. Because human-induced factors have significantly influenced groundwater salinization, urgent action should be considered to ensure sustainable groundwater management in the coastal areas of the Mekong River Delta.
ID 24625
Persistent UFZ Identifier https://www.ufz.de/index.php?en=20939&ufzPublicationIdentifier=24625
Tran, D.A., Tsujimura, M., Ha, N.T., Nguyen, V.T., Binh, D.V., Dang, T.D., Doan, Q.-V., Bui, D.T., Ngoc, T.A., Phu, L.V., Thuc, P.T.B., Pham, T.D. (2021):
Evaluating the predictive power of different machine learning algorithms for groundwater salinity prediction of multi-layer coastal aquifers in the Mekong Delta, Vietnam
Ecol. Indic. 127, art. 107790
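
The abstract names four evaluation metrics (RMSE, R², AIC, BIC) but this record does not give their formulas. As a rough illustration only, the Python sketch below computes them for a regression model. The function name `regression_metrics`, the parameter `n_params`, and the least-squares (Gaussian-error) forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n) are assumptions made here, not taken from the paper; the authors may have used different definitions, and the toy numbers are invented.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return RMSE, R2, AIC and BIC for paired observed/predicted values.

    Assumption: AIC and BIC use the common least-squares form based on the
    residual sum of squares (RSS); n_params (k) is the number of fitted
    parameters the caller chooses to count. Hypothetical helper, not the
    paper's code.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Residual sum of squares and total sum of squares
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_y = sum(y_true) / n
    tss = sum((t - mean_y) ** 2 for t in y_true)
    if rss == 0 or tss == 0:
        raise ValueError("metrics are undefined for a perfect fit or constant target")

    rmse = math.sqrt(rss / n)
    r2 = 1.0 - rss / tss
    aic = n * math.log(rss / n) + 2 * n_params
    bic = n * math.log(rss / n) + n_params * math.log(n)
    return {"rmse": rmse, "r2": r2, "aic": aic, "bic": bic}


# Toy data, invented for illustration
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0], n_params=2)
print(metrics)
```

Under these definitions, lower RMSE, AIC, and BIC and higher R² indicate a better fit, which matches how the abstract ranks the four models.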