Abstract
Phishing websites remain a serious cybersecurity threat because they exploit users’ trust
to steal private data. Using a large dataset obtained from Kaggle, this work assesses and
compares the effectiveness of deep learning (DL), ensemble learning (EL), and traditional
machine learning (TML) models in identifying phishing websites. This study
offers a broad performance analysis across multiple model architectures, applying
thorough preprocessing, class-balance handling, and a range of evaluation metrics, including
accuracy, recall, precision, F1-score, and area under the receiver operating characteristic
curve (ROC-AUC). The findings show that EL models, random forest (RF) and bagging in
particular, consistently outperformed the alternatives in accuracy and robustness,
which makes them well suited to real-time phishing detection systems. Specifically, RF achieved
98.92% accuracy, 0.99 precision, 0.99 recall, 0.99 F1-score, and 1.00 ROC-AUC, outperforming
both the best DL model, multilayer perceptron (MLP) (95.67% accuracy, 0.95 F1-score, 0.99
ROC-AUC), and the best TML model, decision tree (DT) (98.18% accuracy, 0.98 F1-score,
0.98 ROC-AUC). RF is therefore recommended as the best model for real-time phishing detection
because of its strong performance, balanced metrics, and computational efficiency. Although
they involve trade-offs
in computational cost and false positive rates, DL algorithms such as convolutional neural
networks (CNNs) also show promising results. This study highlights practical implications
for strengthening cybersecurity defences and offers guidance on model selection for
phishing detection.
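
For readers unfamiliar with the reported metrics, the sketch below shows one way accuracy, precision, recall, F1-score, and ROC-AUC could be computed for a binary phishing classifier from true labels and predicted scores. It is an illustration only: the function names (`roc_auc`, `binary_metrics`), the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example and are not the study's code or data.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    # Count pairwise wins of positives over negatives; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, not taken from the study
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1-score, and ROC-AUC (1 = phishing)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": roc_auc(y_true, y_score),
    }


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
for name, value in binary_metrics(labels, scores).items():
    print(f"{name}: {value:.4f}")
```

Reporting all five metrics together, as the study does, guards against a model that scores well on accuracy alone while missing phishing sites (low recall) or flagging legitimate ones (low precision).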