Improved machine learning methods for classification of imbalanced data

Sarah Alexandria Ebiaredoh-Mienye

Thesis

Open access

M.Ing., University of Johannesburg

2021

Handle: https://hdl.handle.net/10210/481973

Abstract

Machine learning

Neural networks (Computer science)

M.Ing. (Electrical Engineering)

The emergence of Big Data and machine learning (ML) has paved the way for numerous scientific advancements. One challenge that has hindered the application of ML algorithms to certain classification tasks is the class imbalance problem, in which the distribution of the target classes is skewed. The problem arises in several domains, including medical diagnosis, credit risk prediction, and fraud detection, where negatively labelled samples considerably outnumber positively labelled samples. Training ML models on imbalanced data often results in poor performance. Several research works have proposed methods to mitigate the class imbalance problem, including data sampling, ensemble learning, and feature learning. This research focuses on effective feature learning. The dissertation presents two ML methods that are implemented to enhance the performance of diverse classifiers on publicly available imbalanced datasets. Firstly, a thorough literature review is conducted on ML algorithms developed to solve the class imbalance problem. Secondly, a method is developed to improve the classification performance of several classifiers using a stacked sparse autoencoder, with application to credit risk prediction. Thirdly, a method is introduced for medical diagnosis using an enhanced sparse autoencoder and softmax regression. The methods implemented in this research outperformed most of the ML algorithms and scholarly works used for comparison. Furthermore, this research demonstrates the effect of effective feature learning on the performance of classifiers and the importance of training these classifiers with relevant data.
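
The abstract's remark that imbalanced training data often leads to poor performance can be illustrated with a minimal sketch. The toy labels (95 negatives, 5 positives), the function names, and the "always predict the majority class" baseline are assumptions made here for illustration; none of them come from the dissertation, and this is not its method.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labelled correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of positive samples that were detected (0.0 if there are none)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced labels: 95 negative samples, 5 positive samples.
y_true = [0] * 95 + [1] * 5

# A trivial baseline that ignores its input and always predicts the majority class.
y_majority = [0] * len(y_true)

acc = accuracy(y_true, y_majority)
rec = recall(y_true, y_majority)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")
```

On this toy data the baseline scores high accuracy while detecting none of the positive samples, which is one reason accuracy alone can be a misleading measure on imbalanced datasets.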

Details

Title: Improved machine learning methods for classification of imbalanced data
Creators - without role: Sarah Alexandria Ebiaredoh-Mienye
Contributors - without role: Theo G. Swart; Ebenezer Esenogho
Awarding Institution: University of Johannesburg; M.Ing.
Theses and Dissertations: M.Ing., University of Johannesburg
Identifiers: 9912254907691
Copyright: University of Johannesburg
Academic Unit: Electrical and Electronic Engineering Studies
Resource Type: Thesis