Search-Based Generation of Human Readable Test Data and Its Impact on Human Oracle Costs

Abstract

The frequent non-availability of an automated oracle makes software testing a tedious manual task that relies on the expensive effort of a human oracle. Despite this, the literature on automated test data generation has mainly focused on achieving structural code coverage, without simultaneously considering the reduction of human oracle cost. One source of human oracle cost is the unreadability of machine-generated test inputs, which can result in test scenarios that are hard to comprehend and time-consuming to verify. This is particularly apparent for string inputs consisting of arbitrary sequences of characters that are dissimilar to values a human tester would normally generate. The key objectives of this research are to investigate the impact of a seeded search-based test data generation approach on test data oracle costs, and to propose a novel technique that can generate human readable test inputs for string data types.

The first contribution of this thesis is the result of an empirical study in which human subjects are invited to manually evaluate test inputs generated using the seeded and unseeded search-based approaches for 14 open source case studies. For 9 of the case studies, manual evaluation was significantly less time-consuming for inputs produced using the seeded approach, while the accuracy of test input evaluation was also significantly improved in 2 cases.

The second contribution is the introduction of a novel technique in which a natural language model is incorporated into the search-based process with the aim of improving the human readability of generated strings. A human study is performed in which test inputs generated using the technique for 17 open source case studies are evaluated manually by human subjects. For 10 of the case studies, manual evaluation was significantly less time-consuming for inputs produced using the language model. In addition, the results revealed that the accuracy of test input evaluation was also significantly enhanced for 3 of the case studies.
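
As a rough illustration of how a language model might guide a search towards readable strings, the sketch below scores candidate strings with a character bigram model and prefers the highest-scoring one. The abstract does not specify the model used in the thesis, so the bigram model, the add-one smoothing, the tiny training word list and every function name here are assumptions made purely for illustration.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def train_bigram_model(corpus_words):
    """Count character bigrams, using ^ and $ as start and end markers."""
    pair_counts = Counter()
    first_counts = Counter()
    for word in corpus_words:
        padded = "^" + word.lower() + "$"
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts


def readability_score(candidate, model):
    """Average log-probability per character transition (higher is more word-like)."""
    pair_counts, first_counts = model
    # Add-one smoothing; the denominator assumes letters plus the two markers.
    vocab_size = len(ALPHABET) + 2
    padded = "^" + candidate.lower() + "$"
    log_prob = 0.0
    for a, b in zip(padded, padded[1:]):
        p = (pair_counts[(a, b)] + 1) / (first_counts[a] + vocab_size)
        log_prob += math.log(p)
    return log_prob / (len(padded) - 1)


def pick_most_readable(candidates, model):
    """Hypothetical tie-breaker: among candidates assumed to achieve the
    same coverage, keep the one the language model scores highest."""
    return max(candidates, key=lambda s: readability_score(s, model))


if __name__ == "__main__":
    # Toy training data, an assumption; a real model would use a large corpus.
    model = train_bigram_model(
        ["test", "data", "search", "string", "reader", "tester", "input"]
    )
    print(pick_most_readable(["qxzvkj", "tester"], model))
```

In a search-based generator, a score of this kind could act as a secondary objective alongside the coverage-oriented fitness, so that readability only decides between inputs that are otherwise equally good.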

Metadata

Supervisors:	McMinn, Phil
Awarding institution:	University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield); The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Identification Number/EthosID:	uk.bl.ethos.581631
Depositing User:	Miss Sheeva Afshan
Date Deposited:	10 Sep 2013 08:12
Last Modified:	03 Oct 2016 10:46
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:4337

Download

Thesis

Filename: Thesis.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License

