
Integrating machine learning and multi-linear regression modeling approaches in groundwater quality assessment around Obosi, SE Nigeria
Springer, Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development. 25(12):14567-14606
This study integrated machine learning and multi-linear regression modeling approaches in groundwater quality assessment around Obosi, SE Nigeria, with the aim of predicting groundwater quality parameters. Forty-two groundwater samples were collected and analyzed for physical parameters (EC and pH) and heavy metals (As, Fe, Ni, Cu, Cr, Pb) using standard methods. The physicochemical assessment revealed that EC, pH, Cr, and Ni were within the recommended standards, whereas Cd, Fe, Pb, and As exceeded them, with heavy metal concentrations following the trend Fe > Pb > As > Cd > Cr > Ni. The aquiferous materials of the study area are highly permeable and relatively shallow, which accounts for the observed groundwater contamination. Contamination factor results showed that the samples were not contaminated, except for [...], for which the majority of samples showed a very high degree of contamination. Pollution load index (PLI) values indicated excellent groundwater quality. The heavy metal evaluation index (HEI), potential ecological risk index (ERI), and modified degree of contamination (mCd) values revealed that the majority of the groundwater samples have high contamination. Elemental contamination index and overall metal contamination index (MCI) values for all samples were less than 5, which implies very low contamination. Water quality index (WQI) and pollution index of groundwater (PIG) values showed that the water is unsuitable for drinking and requires treatment before use. The correlation matrix showed no correlation between the parameters. Principal component analysis showed loadings among the parameters. Eight artificial neural network (ANN) and multi-linear regression (MLR) models were developed with very high R² values, showing that they are effective and reliable for forecasting the pollution indices. Based on their respective performances, ANN and MLR models should be considered for additional investigation since they showed a high viable incl
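
The abstract names several pollution indices and an MLR model without giving their formulas. The following is a minimal sketch, assuming the definitions commonly used in the literature: contamination factor as measured over reference concentration, PLI as the geometric mean of the contamination factors, mCd as their arithmetic mean, and HEI as the sum of concentration over maximum admissible concentration. The reference limits, the three-metal subset, the synthetic concentrations, and all function names are illustrative assumptions, not values or methods taken from the study. The ANN models are not sketched here.

```python
import math
import random


def contamination_factor(conc, ref):
    """CF_i = measured concentration / reference concentration, per metal."""
    return [c / r for c, r in zip(conc, ref)]


def pollution_load_index(cf):
    """PLI = geometric mean of the contamination factors (all CF must be > 0)."""
    return math.exp(sum(math.log(v) for v in cf) / len(cf))


def modified_degree_of_contamination(cf):
    """mCd = arithmetic mean of the contamination factors."""
    return sum(cf) / len(cf)


def heavy_metal_evaluation_index(conc, mac):
    """HEI = sum over metals of concentration / maximum admissible concentration."""
    return sum(c / m for c, m in zip(conc, mac))


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns (coefficients, R^2); coefficients[0] is the intercept.
    """
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * t for r, t in zip(A, y)) for i in range(p)]
    coef = solve(ata, aty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in A]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - q) ** 2 for t, q in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return coef, 1.0 - ss_res / ss_tot


# Placeholder reference limits in mg/L for Fe, Pb, As (assumed, not from the study).
ref = [0.3, 0.01, 0.01]

# Synthetic concentrations for 42 hypothetical samples (NOT the study's data).
random.seed(0)
samples = [
    [random.uniform(0.1, 1.0), random.uniform(0.005, 0.05), random.uniform(0.005, 0.05)]
    for _ in range(42)
]

hei = [heavy_metal_evaluation_index(s, ref) for s in samples]
pli = [pollution_load_index(contamination_factor(s, ref)) for s in samples]

# Fit one MLR per index, using the metal concentrations as predictors.
coef_hei, r2_hei = fit_mlr(samples, hei)
coef_pli, r2_pli = fit_mlr(samples, pli)
print("HEI model R^2:", round(r2_hei, 4))
print("PLI model R^2:", round(r2_pli, 4))
```

Because HEI under this definition is linear in the concentrations, the linear fit recovers it almost exactly on the synthetic data, while PLI, a geometric mean, is only approximated. With real measurements, the reference limits would be replaced by the standard actually adopted in the assessment.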