Abstract
The increasing worldwide demand for engineers to fill skill shortages makes improving graduation rates and reducing dropout rates imperative. Understanding the performance patterns of engineering students, and the factors influencing students’ choice of and success in science, technology, engineering, and mathematics (STEM), can aid the development of interventions that increase engineering students’ success. By leveraging recent advancements in educational data mining (EDM), recommender systems can assist student-faculty advisors in giving students timely guidance and support, thereby helping to address dropout. EDM can reveal hidden information in data and enable accurate predictions. However, it also raises fairness concerns, because models can encode social biases present in the analysed data and so produce biased outcomes.
The research was conducted within the multidisciplinary and interdisciplinary fields of machine learning, EDM, and engineering education. This study investigates how data stored in the repositories of higher education institutions (HEIs) can be harnessed effectively to generate insights that mitigate challenges in engineering education, such as dropouts, qualification changes, and extended completion times. The study also considers biases in the data in order to minimise their impact.
This thesis (in article format) presents a collection of four articles aimed at understanding and creating a solution to the above challenge. The articles focused on four key areas: i) recommender systems in higher education; ii) factors influencing students’ choice of and success in STEM; iii) exploratory analysis of student performance patterns in engineering at a public university in South Africa; and iv) the development of models for predicting student performance and qualification enrolments.
The first study, in Chapter Three, presents a systematic literature review on recommender systems for elective course selection in higher education and highlights the limited number of research articles that address this topic. The findings show that various recommender system approaches and data mining algorithms are used to recommend elective courses, and they emphasise the need for further exploration of under-investigated areas to enhance the effectiveness of these systems.
Chapter Four contains the second study, which focuses on the factors that influence students' choice of and success in STEM programs. Understanding these factors is crucial for informing recruitment and support interventions and for improving graduation rates. The study used social cognitive career theory (SCCT) as its theoretical framework and followed a bibliometric analysis workflow to examine the state of research in this area. It offers theoretical insights for enhancing success rates in STEM qualifications, highlights research gaps, and proposes a research agenda for future investigations.
The third study (Chapter Five) used EDM, exploratory data analysis, and correlation analysis to identify patterns in engineering students' performance and to explore factors contributing to dropout, with the aim of developing interventions that improve student success in engineering. The findings indicate gender disparities in engineering enrolments and low graduation rates (13,5% in minimum time). A higher proportion of female students (21%) than male students (18%) completed their studies, while the rates of continuing studies were similar between genders (58% for males and 59% for females), with comparable trends observed in other performance categories. Correlation analysis shows no significant correlations between gender, race, admission point score (APS), mathematics mark, science mark, previous activity, and performance in engineering. Understanding student performance patterns can guide appropriate program choices and interventions to reduce dropout rates and enhance success in engineering education.
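As an illustration only, the kind of correlation check described above can be sketched with a Pearson coefficient computed in plain Python. The function name `pearson`, the variable names, and all the values below are hypothetical and are not drawn from the thesis dataset; the thesis does not state which correlation measure was used, so Pearson is an assumption here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, made-up values for eight students (NOT thesis data).
aps = [42, 40, 38, 30, 28, 31, 35, 33]
first_year_avg = [71, 55, 68, 49, 60, 52, 58, 63]

r = pearson(aps, first_year_avg)
print(f"Pearson r between APS and first-year average: {r:.2f}")
```

A coefficient near zero would suggest little linear association between the two variables; judging significance would additionally require a hypothesis test, which this sketch omits.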
The study presented in Chapter Six addressed the challenges of using real-world datasets, such as class imbalance and ethical considerations. The study applied decision tree, light gradient boosting machine (LightGBM), k-nearest neighbours (KNN), random forest, and extreme gradient boosting (XGB) models to the dataset. The models were trained to determine feature strengths and to predict student performance and qualification enrolment. The study found that APS, mathematics marks, and physical sciences marks were important features. The developed models demonstrate accurate predictions that can be utilised in a recommender system for academic advising, selecting appropriate qualifications, and improving student success.
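To make one of the named model families concrete, the following is a minimal k-nearest neighbours sketch in plain Python. It is not the thesis implementation: the helper names (`fit_min_max`, `knn_predict`), the two outcome labels, and every record are hypothetical, and only the three features named above (APS, mathematics mark, physical sciences mark) are used.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Return a function that rescales a feature row to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Guard against a zero range so constant columns do not divide by zero.
    spans = [(max(col) - low) or 1.0 for col, low in zip(columns, lows)]

    def scale(row):
        return [(v - low) / span for v, low, span in zip(row, lows, spans)]

    return scale


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to the query."""
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical records: (APS, mathematics mark, physical sciences mark).
features = [
    (42, 85, 80), (40, 78, 75), (38, 72, 70),
    (30, 55, 52), (28, 50, 48), (31, 58, 50),
]
labels = ["completed"] * 3 + ["dropped out"] * 3

scale = fit_min_max(features)
train_x = [scale(row) for row in features]

prediction = knn_predict(train_x, labels, scale((39, 75, 74)), k=3)
print("Predicted outcome:", prediction)
```

Scaling matters for KNN because the features sit on different ranges and would otherwise contribute unequally to the distance. A real, imbalanced dataset would also need a resampling or weighting step and a held-out evaluation, neither of which is shown here.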
Understanding recommender systems and the factors influencing students' success in STEM programs and student performance patterns is necessary to develop machine learning models that can be used to address challenges related to the shortage of engineering professionals. The results obtained from the different algorithms illustrate their effectiveness.
Keywords: APS, educational data mining, engineering education, mathematics, physical sciences, recommender systems, STEM, student performance, student retention