Development of machine learning model for predicting possible side-effects of computationally synthesized lead hypertension drugs molecules

Takudzwa Ndhlovu

Thesis

Open access

Master of Artificial Intelligence, University of Johannesburg

2024

Handle:

https://hdl.handle.net/10210/519312

Abstract

Hypotensive agents - Side effects

Artificial intelligence - Medical applications

Machine Learning

Computational Chemistry

Hypertension is a critical public health challenge that affects over one billion adults globally. While medications and treatments continue to improve, only 20% of those with hypertension are successfully managing their condition. A significant contributing factor is the debilitating side effects associated with antihypertensive drugs, such as headaches, heart palpitations, hypotension, and hyperkalemia. These adverse effects often lead to poor patient adherence to prescribed treatments. With the advent of artificial intelligence (AI) in computational drug discovery, a large number of novel lead antihypertensive drug molecules are being generated. However, predicting the potential side effects of these newly synthesized compounds remains a significant challenge, especially given the complexity of biological interactions and the limited data available on these molecules. Traditional methods for evaluating drug side effects are time-consuming, expensive, and applicable only in the later stages of drug development. This research addresses these challenges by developing a machine learning model, specifically a gradient boosting classifier, to predict the side effects of AI-generated hypertension drug candidates early in the drug development process. The model was trained on engineered features derived from molecular structures, including functional groups and molecular properties, and used SMOTE oversampling to address data imbalance. Evaluation by cross-validation demonstrated the model’s strong performance, with high recall and AUC-ROC scores across multiple side effect categories. Key findings include the identification of polar surface area and hydrogen bond donors as significant predictors of adverse effects. The gradient boosting classifier outperformed the baseline random forest model, achieving an average F1 score of 87.22%, with a 34.43% improvement in AUC-ROC after oversampling.
Functional group analysis revealed key insights into chemical predictors of side effects, with groups such as phenols and carboxylic acids prominently influencing multiple adverse conditions. When tested against real-world hypertension drug data, the model demonstrated strong predictive capabilities for respiratory and cardiovascular side effects. By streamlining side effect prediction, this research has the potential to contribute to safer and more effective antihypertensive drugs.
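
The abstract names SMOTE oversampling as the remedy for class imbalance but this record does not show how it was applied. The following is a minimal sketch of the general SMOTE idea only, not the thesis implementation: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name, the parameter defaults, and the toy feature values (polar surface area, hydrogen bond donor count) are all hypothetical and chosen for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours.

    Illustrative sketch of the SMOTE idea; names and defaults are assumptions,
    not taken from the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # gap in [0, 1) sets the position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: [polar surface area, hydrogen bond donors]
minority_rows = [[90.0, 3.0], [110.0, 4.0], [100.0, 5.0], [95.0, 2.0]]
new_rows = smote_oversample(minority_rows, n_new=6)
```

When oversampling is combined with cross-validation, synthetic rows are normally generated from the training folds only, so that no interpolated copy of a held-out sample leaks into training. Libraries such as imbalanced-learn provide a maintained SMOTE implementation that would typically be preferred over a hand-written one.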

Files and links (1)

pdf

T_Ndhlovu_2170010632.77 MB

Details

Title: Development of machine learning model for predicting possible side-effects of computationally synthesized lead hypertension drugs molecules
Creators - without role: Takudzwa Ndhlovu
Contributors - without role: Uche Kennedy Okonkwo
Awarding Institution: University of Johannesburg; Master of Artificial Intelligence
Theses and Dissertations: Master of Artificial Intelligence, University of Johannesburg
Identifiers: 9961205607691
Copyright: University of Johannesburg
Academic Unit: University of Johannesburg; Faculty of Engineering & the Built Environment
Language: English
Resource Type: Thesis