Abstract
With the increase in global internet connectivity, optical networks are scaling up to meet consumer demand.
The quality of transmission (QoT) prediction problem arises when inaccurate hyperparameters are used to
estimate the availability of an unestablished lightpath. This dissertation explores QoT prediction using
machine learning models and hyperparameter tuning techniques to maximize accuracy. Using Microsoft’s optical
dataset, we investigated the performance of various regression methods and the impact of two types of
data transformations. A variety of popular regression techniques were studied, namely, linear regression,
gradient boosting, neural network, random forest and long short-term memory (LSTM). The Box-Cox and
Yeo-Johnson transformations were investigated. The study also compared the performance of different
feature combinations.
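
As a rough illustration of the two transformations named above, the following Python sketch applies Box-Cox and Yeo-Johnson to individual values with a hand-picked lambda. The function names, the fixed `lam` argument, and the sample values are assumptions made for illustration and are not taken from the dissertation. In practice lambda is usually estimated from the data by maximum likelihood rather than chosen by hand.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a strictly positive value x for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def yeo_johnson(x: float, lam: float) -> float:
    """Yeo-Johnson transform; unlike Box-Cox it also accepts zero and negatives."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    # Negative branch uses the exponent (2 - lambda).
    if lam == 2:
        return -math.log1p(-x)
    return -(((1.0 - x) ** (2.0 - lam) - 1.0) / (2.0 - lam))


# Hypothetical feature values, not taken from the dataset.
samples = [0.5, 1.0, 4.0]
box_cox_values = [box_cox(v, 0.5) for v in samples]
# Yeo-Johnson can also handle the negative value appended here.
yeo_johnson_values = [yeo_johnson(v, 0.5) for v in samples + [-2.0]]
```

The main practical difference is the input domain: Box-Cox requires strictly positive inputs, whereas Yeo-Johnson is defined for all real values.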
The performance of the machine learning models was assessed using two metrics: mean squared error (MSE)
and mean absolute error (MAE). The evaluation results showed that the models performed better on
untransformed data than on transformed data. Linear regression achieved the lowest MAE, 0.0107, using the
polarization mode dispersion and chromatic dispersion features (bivariate), compared with 0.0571 for multivariate
gradient boosting. LSTM produced substantially higher MSE and MAE on
this dataset. Hyperparameter tuning improved the models’ performance, making them
suitable for predicting the QoT of unestablished lightpaths.
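
For reference, the two evaluation metrics can be sketched in a few lines of Python. This is a minimal illustration under assumed names (`mean_squared_error`, `mean_absolute_error`) and made-up values. It is not the evaluation code used in the study.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions, not results from the dissertation.
targets = [1.0, 2.0, 3.0]
predictions = [1.0, 2.0, 5.0]
mse = mean_squared_error(targets, predictions)
mae = mean_absolute_error(targets, predictions)
```

Because MSE squares each error, a single large miss weighs more heavily in MSE than in MAE, which is why the two metrics can rank models differently.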