Research on Water Quality Prediction Based on Machine Learning
Abstract
Some urban water plants in China have begun using chloramine disinfection, which raises the question of how to determine whether the disinfected water is drinkable. This study uses a water quality prediction dataset that includes indicators such as chloramine and trihalomethanes. First, descriptive statistics and Pearson correlation analysis are carried out between the chloramine and trihalomethane data and the target variable (whether the water is drinkable). Because water quality cannot be judged from these two indicators alone, additional indicators such as pH are also used. To build a more accurate prediction model, the dataset is first preprocessed: missing values are analyzed statistically, outliers are identified with box plots, and missing values are filled using the KNN algorithm. Feature engineering follows, including the Yeo-Johnson transformation, correlation analysis, and calculation of SHAP values. The processed data are then fed into three classification models: Stacking, Voting, and an attention-based CNN-LSTM. Each model is trained with random search and cross-validation to obtain its optimal hyperparameters, and the relevant evaluation metrics are calculated for each model to measure its accuracy.
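The preprocessing and training steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, substitutes a synthetic dataset for the real water quality data, and uses hypothetical base learners (random forest and k-nearest neighbours) and hypothetical search ranges. It covers only KNN imputation, the Yeo-Johnson transformation, a Stacking classifier, and random search with cross-validation; the Voting and CNN-LSTM models, outlier handling, and SHAP analysis are left out.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-in for the water quality table (assumption: 9 numeric
# indicators and a binary "drinkable" label; the real data are not used here).
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)

# Blank out roughly 5% of the entries to mimic missing measurements.
rng = np.random.default_rng(0)
X = X.copy()
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Stacking ensemble; the base learners and meta-learner are illustrative choices.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Imputation and transformation sit inside the pipeline so that they are
# refitted on each training fold during cross-validation.
pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),
    ("yeo_johnson", PowerTransformer(method="yeo-johnson")),
    ("model", stack),
])

# Hypothetical search space sampled by random search.
param_distributions = {
    "impute__n_neighbors": [3, 5, 7],
    "model__rf__max_depth": [3, 5, None],
    "model__knn__n_neighbors": [3, 5, 9],
}

search = RandomizedSearchCV(pipeline, param_distributions, n_iter=4, cv=3,
                            scoring="accuracy", random_state=0)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best hyperparameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Keeping the imputer and the transformer inside the pipeline is a design choice that avoids leaking information from validation folds into the preprocessing statistics.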