Abstract
This work develops forecasting models for the power output of multiple photovoltaic (PV) technologies installed at the outdoor test facility on the Pretoria campus of the Council for Scientific and Industrial Research. To forecast the power output of the different PV modules, nine machine learning algorithms are trained on historical time-series datasets: Random Forest (RF), Extreme Gradient Boosting (XGB), AdaBoost, Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Stacked Long Short-Term Memory (S-LSTM), Bidirectional Long Short-Term Memory (Bi-LSTM), and the hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM). Hourly averages of the data measured from January 2019 to November 2019 are used for training, validation, and testing. For additional testing, sub-hourly data from January 2020 to March 2020 are classified into monthly records for clear, moderate, and cloudy skies. In a pre-processing stage, outliers are identified and removed from the data. To overcome the shortcomings of traditional methods in solving complex problems, the machine learning models are trained extensively on these datasets. The effect of the different sky conditions on the prediction errors is analysed, the interrelationship between the meteorological and electrical input parameters is evaluated and discussed, and the relative importance of the input features is determined. To evaluate prediction performance, the predicted values are compared with the measured power output of the system, and accuracy is quantified using the mean squared error (MSE) and the root mean square error (RMSE). In terms of RMSE, LSTM, Bi-LSTM, and RF outperform the other machine learning algorithms by a wide margin, achieving minimum RMSE values of 6.9 W, 7.1 W, and 7.6 W, respectively, on the Nice module type. Overall, RF outperforms the other algorithms, with few exceptions, across all PV models and all three sky conditions.
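The two error metrics named above follow their standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration, not the study's evaluation code: the function names `mse` and `rmse` and the sample power values are hypothetical and chosen only to show the arithmetic.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between measured and forecast power (units: W^2)."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the same units as the power values (W)."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical hourly power values in watts (not data from the study).
measured_w = [120.0, 150.0, 90.0, 60.0]
forecast_w = [114.0, 158.0, 90.0, 62.0]

print(mse(measured_w, forecast_w))   # squared errors 36, 64, 0, 4 -> 26.0
print(rmse(measured_w, forecast_w))  # square root of the MSE
```

Because the RMSE carries the same unit as the power output, it can be quoted directly in watts, whereas the MSE is expressed in squared watts.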