Abstract
Advances in technology have brought benefits in many areas. One outcome is the cyber-physical
system (CPS), which has emerged as a tool for transformation across many sectors.
A CPS integrates two subsystems: a computational, or cyber, subsystem, with components such as
communication infrastructure, sensors, and other computational elements, and a physical
subsystem. The combination is intended to give more efficient and faster performance.
The CPS comes in many forms, one of which is the smart grid. A smart grid combines the
electricity network with a cyber subsystem to enhance the functioning of the system, especially
power distribution, and expands the carrying capacity of the distribution system to serve a
growing population.
As cyber threats have risen, CPSs have also come under different kinds of attack, causing
damage and breakdowns.
These threats have created a need for security measures. The design of security measures
against cyber threats in a smart grid, some of which are cryptographic, has progressed over
time. However, the occurrence and the damaging effects of cyber threats have continued to
increase.
In this work, Machine Learning (ML) models of different classes, ranging from simple ML models
to ensemble and boosting models, were applied to real-time power data to detect cyber threats
in smart grids.
Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbours (KNN), Random
Forest (RF), and Extreme Gradient Boosting (XGBoost) were applied to the collected power data
of 78,377 samples containing different types and classes of cyber threats.
The Random Forest (RF) outperformed the other four algorithms, classifying cyber threats in the
smart grid considered with an accuracy of over 95% in training and over 85% in testing. This
implies that, on this data, the ensemble supervised ML algorithm is more effective than the
single models at detecting cyber threats in a smart grid.
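
The kind of comparison summarised above can be sketched as follows. This is a minimal
illustration, not the study's code: it assumes scikit-learn, uses a synthetic dataset from
make_classification in place of the collected power data, and substitutes scikit-learn's
GradientBoostingClassifier for XGBoost so that the example has a single dependency. The
sample size, feature count, number of classes, split ratio, and hyperparameters are all
assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the power data (assumption: NOT the 78,377-sample set).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10, n_classes=3, random_state=0
)

# Hold out 20% for testing, keeping class proportions (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale-sensitive models get a StandardScaler; tree ensembles do not need one.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    # Stand-in for XGBoost (assumption): scikit-learn's gradient boosting.
    "GB": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record (training accuracy, testing accuracy).
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = (train_acc, test_acc)

for name, (train_acc, test_acc) in results.items():
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
```

Reporting training and testing accuracy side by side, as the abstract does for RF, makes
the gap between the two visible. The scores printed by this sketch come from synthetic
data and say nothing about the results reported here.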