Abstract
The increasing sophistication of grid-connected photovoltaic (GCPV) systems necessitates
advanced fault detection and diagnosis (FDD) methods to ensure operational efficiency
and security. This paper analyzes a novel two-stage hybrid AI architecture that
couples a Long Short-Term Memory (LSTM) autoencoder for unsupervised anomaly
detection with a random forest (RF) classifier for focused fault diagnosis. The architecture is
critically compared against an RF-only baseline on a synthetic dataset. The two-stage
hybrid model achieves an overall accuracy of 83.1%. The hybrid model’s
first stage trains only on unlabeled healthy data, reducing the reliance on extensive and
often unavailable labeled fault datasets. This design has the safety-critical advantage
of marking unfamiliar faults as anomalies instead of committing to a misclassification.
By integrating anomaly detection with classification, the architecture enables early-stage
screening of faults and targeted categorization, even in data-scarce scenarios. This offers a
scalable, interpretable solution suitable for deployment in real-world GCPV systems where
robustness and early detection are critical. Although the method is less sensitive to
subtle or recurring faults, it reliably detects distinct
and significant anomalies. Additionally, the approach improves interpretability, facilitating
clearer identification of performance constraints such as the autoencoder’s moderate fault
sensitivity (AUC = 0.61). This study supports the hybrid approach as a promising
FDD solution, in which the architectural advantages of safety and maintainability offer
more value to real-world systems than incremental improvements in a
single accuracy measure.