Academic Paper
Reduction of Input Features from Machine Learning Datasets for Water Quality Analysis
Document Type
Conference
Source
2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA), pp. 1-6, Feb. 2024
Abstract
Typical water quality testing methods used by water treatment organizations are complex, time-consuming, and expensive because they require datasets with a very large number of input features. Studies show that machine learning has the potential to help analyze water quality. This study applies machine learning techniques to reduce the number of input features, allowing frequent water tests at a lower cost. First, recursive feature elimination with cross-validation (RFECV), permutation importance (PI), and random forest (RF) techniques are used to identify the most prominent features. Second, an artificial neural network (ANN) and a support vector machine (SVM) are used to verify that the accuracy obtained with the reduced feature set is acceptable. A dataset from Kaggle with nine features and 2011 data points is used in this study. Experimental results show that the dataset with five features produces
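
The two-stage procedure described in the abstract (rank features with RFECV, PI, and RF, then check accuracy with an SVM and an ANN) could be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' implementation. It assumes a binary classification target, uses synthetic data from `make_classification` as a stand-in for the Kaggle dataset (only the shape, 2011 rows by 9 features, is taken from the abstract), and picks the five retained features by permutation importance alone. All hyperparameters, the train/test split, and the variable names are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in data, not the Kaggle water quality dataset.
# Only the shape (2011 samples, 9 features) mirrors the abstract.
X, y = make_classification(
    n_samples=2011, n_features=9, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Stage 1a: recursive feature elimination with cross-validation.
# RFECV clones the estimator internally, so `rf` itself stays unfitted here.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rfecv = RFECV(estimator=rf, step=1, cv=3, scoring="accuracy")
rfecv.fit(X_train, y_train)
rfecv_ranking = rfecv.ranking_  # rank 1 marks features RFECV kept

# Stage 1b: impurity-based importances from a fitted random forest.
rf.fit(X_train, y_train)
rf_importance = rf.feature_importances_

# Stage 1c: permutation importance measured on held-out data.
pi = permutation_importance(rf, X_test, y_test, n_repeats=5, random_state=0)
pi_importance = pi.importances_mean

# ASSUMPTION: keep the top K=5 features by permutation importance.
# How the three rankings are combined in the paper is not stated in the abstract.
K = 5
selected = np.sort(np.argsort(pi_importance)[::-1][:K])
all_cols = np.arange(X.shape[1])


def evaluate(model, cols):
    """Fit a scaled model on the given feature columns; return test accuracy."""
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train[:, cols], y_train)
    return pipe.score(X_test[:, cols], y_test)


# Stage 2: compare accuracy with all features against the reduced set.
model_factories = {
    "SVM": lambda: SVC(kernel="rbf"),
    "ANN": lambda: MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=1000, random_state=0
    ),
}
results = {
    name: {
        "all": evaluate(make(), all_cols),
        "reduced": evaluate(make(), selected),
    }
    for name, make in model_factories.items()
}

for name, scores in results.items():
    print(f"{name}: all={scores['all']:.3f} reduced={scores['reduced']:.3f}")
```

Comparing the `all` and `reduced` accuracies per model is the kind of check the abstract describes. The numbers printed on synthetic data say nothing about the paper's reported results.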