Abstract
Recent advances in educational technology have highlighted the growing need for effective automatic grading of descriptive answers in online assessments. This study investigates the viability of natural language processing approaches for this task. To this end, three distinct approaches were developed and evaluated using design science research: a simple text similarity approach, a semantic similarity approach, and a machine learning-based approach. Saunders' research onion was used as a general guide to organise the study, situating design science research within a pragmatist philosophical stance. The design science research process involved iterative cycles of problem identification, solution design, development, and evaluation. Primary and secondary data were collected and analysed to evaluate how closely each algorithm aligned with human marker grading and to assess human marker perceptions of automatic grading. The viability of the developed approaches was evaluated against six key factors: accuracy, effectiveness, efficiency, integrity, reliability, and relevance.
The key findings reveal that the text similarity-based approach offers simplicity and ease of implementation, making it practical for grading straightforward student answers. The semantic similarity approach captured the meaning and context of student answers, leading to stronger performance on complex answers. The machine learning-based approach aligned most closely with human marking, suggesting that machine learning is robust in handling diverse student answer patterns. Human markers agreed that the grading approaches were effective, relevant, reliable, and maintained integrity, but were neutral about their accuracy and efficiency.
These findings suggest that automatic grading techniques hold promise for enhancing the grading of descriptive answers and reducing the manual workload in educational settings. The study contributes to the growing body of knowledge on artificial intelligence-driven educational tools and provides a foundation for future research aimed at optimising automatic grading systems.