Abstract
South Africa is currently facing a severe and persistent energy crisis, driven in part by an
overreliance on coal-based electricity generation and the limitations of the national grid
to meet increasing demand. In response, the adoption of solar photovoltaic systems has
expanded significantly across residential, commercial, and utility-scale sectors. However,
the intermittent and weather-dependent nature of solar energy presents challenges for
effective integration into the power system. Accurate forecasting of solar power output
is therefore essential for improving grid reliability, informing operational planning, and
enabling efficient energy management. This research explores the use of advanced machine
learning techniques to enhance short-term solar power output forecasting within the South
African context.
The study investigates three machine learning models: Long Short-Term Memory (LSTM)
networks, Support Vector Regression (SVR), and Extreme Gradient Boosting (XGBoost).
These models were trained and tested using high-resolution data collected from a custom-built
solar monitoring system deployed in Johannesburg, combined with meteorological
data obtained through the Solcast API. A comprehensive feature selection and data
preprocessing strategy was implemented to account for local environmental conditions,
system variability, and known data quality challenges. Each model was evaluated on
forecasting accuracy using metrics such as Root Mean Squared Error (RMSE), Mean Absolute
Error (MAE), and the coefficient of determination (R²).
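For reference, the three accuracy metrics named above can be computed as in the minimal sketch below. It uses the standard textbook definitions and is not the study's own evaluation code. The function names and the sample values in `observed` and `forecast` are hypothetical, chosen only for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and forecast power values (illustrative only).
observed = [0.0, 1.0, 2.0, 3.0]
forecast = [0.0, 1.0, 2.0, 5.0]

print(rmse(observed, forecast), mae(observed, forecast), r2(observed, forecast))
```

RMSE penalises large errors more heavily than MAE, so reporting both indicates whether a model's error is dominated by a few large misses. R² expresses the share of variance in the observed output that the forecast explains.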
Among the models tested, XGBoost consistently demonstrated the highest forecasting
accuracy for both monitored solar panels, followed closely by SVR and LSTM. The results show
that empirical field data, when combined with robust machine learning methods, significantly
improve forecast reliability compared to traditional specification-based estimates.
Furthermore, the study highlights the limitations of applying globally trained models to
African climates without adaptation and advocates for the development of region-specific
datasets and protocols. The findings underscore the potential for data-driven, scalable,
and cost-effective forecasting systems to support solar energy planning and deployment in
sub-Saharan Africa. This research contributes to the broader field of renewable energy
forecasting and provides actionable insights for stakeholders involved in solar energy
integration, policy development, and infrastructure planning.