Clegg, Benjamin Simon ORCID: https://orcid.org/0000-0002-1323-7133 (2021) The application of mutation testing to enhance the automated assessment of introductory programming assignments. PhD thesis, University of Sheffield.
Abstract
Growing cohorts of students in introductory programming courses pose a challenge for manual assessment: it is impractical for a tutor to manually evaluate hundreds or even thousands of students’ programs in a timely manner. Furthermore, manual assessment is not always fair; tutors can make mistakes in their assessment. Automated assessment provides a solution to these problems: a computer can evaluate the correctness and style of students’ programs, and generate feedback accordingly, in much less time and with a high degree of consistency. A particularly widespread approach is test-based automated assessment, in which a tutor writes a test suite to evaluate the correctness of students’ programs; the computer executes this suite automatically and generates a grade and applicable feedback according to the test results.
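To make this workflow concrete, the following is a minimal, self-contained sketch in Python. The thesis does not prescribe an implementation; the exercise, the test names, and the proportional grading scheme here are all illustrative assumptions.

import unittest

# A hypothetical student submission, inlined here to keep the sketch
# self-contained; a real grader would load it from the student's file.
def student_add(a, b):
    return a + b  # a faulty submission might instead compute a - b

class TutorTests(unittest.TestCase):
    # Tutor-written assessment tests for the 'add' exercise.
    def test_small_numbers(self):
        self.assertEqual(student_add(2, 3), 5)

    def test_negative_numbers(self):
        self.assertEqual(student_add(-1, 1), 0)

def grade(test_case_class):
    # Grade as the fraction of passing tests: a simple proportional
    # scheme assumed for illustration; real grading policies vary.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    result = unittest.TestResult()
    suite.run(result)
    failed = len(result.failures) + len(result.errors)
    return (result.testsRun - failed) / result.testsRun if result.testsRun else 0.0

print(grade(TutorTests))  # 1.0 for this correct submission

Feedback can be derived from the same results object, for example by reporting the names of the failed tests back to the student.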
Such assessment test suites are not necessarily flawless, however. For example, a test suite may fail to detect some faults present in students’ programs, so students may receive inaccurate grades and feedback where their mistakes are missed. In the software engineering industry, adequacy metrics and test goals are often employed to ensure that test suites can detect faults; by achieving such test goals and high adequacy scores, a test suite should detect faults more reliably. One approach is to measure coverage: which elements of a program are executed by a test suite, and which are not. Naturally, a test suite which exercises more of a program should be more capable of detecting faults. However, executing a program element does not guarantee that a fault within it is detected; for example, some faults only manifest for particular states of the program. Mutation testing offers a different approach to evaluating the adequacy of a test suite: it involves generating artificial faulty variants of the program, called mutants, and executing the test suite on each of them. A test suite which detects (“kills”) more of these mutants should be more capable of detecting faults. Furthermore, the undetected mutants can inform the creation of new tests to improve adequacy.
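The mutation testing loop itself can be sketched just as briefly. What follows is again an illustrative assumption rather than the thesis’s tooling: a single hand-written mutation operator over Python source (real mutation tools apply many operators), with the tutor’s tests re-run against each mutant.

import ast
import unittest

# A hypothetical student submission, held as source text so that it
# can be re-parsed and mutated.
STUDENT_SOURCE = """
def add(a, b):
    return a + b
"""

class AddToSubMutator(ast.NodeTransformer):
    # Single illustrative mutation operator: replace the target-th
    # '+' in the source with '-'.
    def __init__(self, target):
        self.target = target
        self.count = 0

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            if self.count == self.target:
                node.op = ast.Sub()
            self.count += 1
        return node

def make_tests(ns):
    # Tutor tests, bound to whichever (possibly mutated) namespace is
    # currently under test.
    class Tests(unittest.TestCase):
        def test_add(self):
            self.assertEqual(ns["add"](2, 3), 5)
    return unittest.defaultTestLoader.loadTestsFromTestCase(Tests)

def mutation_score(source):
    # Generate one mutant per '+' operator; the score is the fraction
    # of mutants that the test suite detects.
    tree = ast.parse(source)
    n_mutants = sum(isinstance(n, ast.BinOp) and isinstance(n.op, ast.Add)
                    for n in ast.walk(tree))
    killed = 0
    for i in range(n_mutants):
        mutant = ast.fix_missing_locations(AddToSubMutator(i).visit(ast.parse(source)))
        ns = {}
        exec(compile(mutant, "<mutant>", "exec"), ns)
        result = unittest.TestResult()
        make_tests(ns).run(result)
        if not result.wasSuccessful():  # a failing suite kills the mutant
            killed += 1
    return killed / n_mutants if n_mutants else 1.0

print(mutation_score(STUDENT_SOURCE))  # 1.0: the suite kills the only mutant

A score below 1.0 would indicate surviving mutants, each pointing at behaviour the suite never checks, and hence at a candidate new test.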
Accordingly, in this thesis I investigate how mutation testing can be used to improve grading test suites. First, I consider how different test suites can assign varying grades to students’ solution programs: is there a risk of inadequate test suites generating unfair grades? I also investigate how different observable properties, including coverage and the detection of mutants, influence such differences in grades. Finally, I evaluate how applicable mutation testing is to improving grading test suites: do the fundamental assumptions of mutation testing hold for students’ programs, and does improving a test suite’s ability to detect artificial faults also improve its ability to detect students’ faults?
Download
Final eThesis - complete (pdf)
Filename: ben-clegg_thesis-corrected_2022-03-22.pdf
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.