A machine learning-based approach to predict energy consumption in a full-scale wastewater treatment plant
Thesis   Open access

Yolanda Thobile Gegana
M.Eng., University of Johannesburg
2025
Handle: https://hdl.handle.net/10210/519309

Abstract

Keywords: Sustainable engineering; Sewage disposal plants; Machine learning; Data mining
This work investigates the application of machine learning models for predicting energy consumption in a full-scale wastewater treatment plant (WWTP). The substantial operational costs associated with energy consumption in WWTPs, particularly in aeration and pumping, underscore the need for more accurate predictive models. Traditional linear methods often fail to capture the complex, nonlinear relationships between energy usage and its influencing factors, including environmental conditions, inflow characteristics, and treatment parameters. To address these challenges, this study explores advanced machine learning algorithms, including Histogram Gradient Boosting, Extremely Randomised Trees, Light Gradient Boosting Machine, and Random Forest, to develop adaptive and precise predictive models. The research used a comprehensive dataset from the East Melbourne WWTP, which was preprocessed to manage outliers and standardise variables. The models were trained and evaluated using cross-validation to ensure robust performance. Feature importance was assessed with SHapley Additive exPlanations (SHAP), revealing key drivers of energy consumption, including total nitrogen, chemical oxygen demand, average inflow, and seasonal influences from temperature and month. Among the evaluated models, the Histogram Gradient Boosting model performed best, achieving a coefficient of determination (R²) of 0.36 and a Root Mean Square Error (RMSE) of 35.352, thus outperforming state-of-the-art models in the literature. The SHAP analysis also made the contribution of individual features explicit, so that the model is both interpretable and actionable for plant operators. The findings indicate that integrating machine learning techniques significantly improves the prediction of energy consumption, enabling more efficient WWTP operations and improved energy management. This study suggests that leveraging nonlinear relationships through advanced algorithms can lead to substantial energy cost reductions, promoting the sustainability of urban water systems.
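The abstract reports cross-validated R² and RMSE scores. As a rough illustration of how such scores are computed, the sketch below runs k-fold cross-validation with a single-feature least-squares line standing in for the tree-based models named above. It is not the thesis's pipeline: the function names (`rmse`, `r2`, `kfold_indices`, `fit_line`, `cross_validate`), the choice of five contiguous folds, and the synthetic inflow and energy values are all assumptions made for this example.

```python
import math
from statistics import mean


def rmse(y_true, y_pred):
    """Root Mean Square Error: sqrt(mean((y - y_hat)^2))."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def cross_validate(xs, ys, k=5):
    """Return a list of per-fold (R2, RMSE) tuples for the stand-in model."""
    scores = []
    for train, test in kfold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        truth = [ys[i] for i in test]
        scores.append((r2(truth, preds), rmse(truth, preds)))
    return scores


# Synthetic, made-up data (NOT from the thesis): energy rises with inflow,
# with a small alternating perturbation standing in for noise.
inflow = [float(i) for i in range(20)]
energy = [2.0 * x + 10.0 + (1.0 if i % 2 else -1.0) for i, x in enumerate(inflow)]

scores = cross_validate(inflow, energy, k=5)
for fold, (score_r2, score_rmse) in enumerate(scores, start=1):
    print(f"fold {fold}: R2={score_r2:.3f}  RMSE={score_rmse:.3f}")
```

RMSE is expressed in the units of the target variable, whereas R² is unitless: a value of 1 means perfect prediction and 0 means no better than predicting the mean of the held-out values. To compare the models named in the abstract, a nonlinear regressor could replace `fit_line` while the fold and metric logic stays the same.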
