Understanding and Mitigating Flaky Software Test Cases

Abstract

A flaky test is a test case that can pass or fail without changes to the test case code or the code under test. They are a wide-spread problem with serious consequences for developers and researchers alike. For developers, flaky tests lead to time wasted debugging spurious failures, tempting them to ignore future failures. While unreliable, flaky tests can still indicate genuine issues in the code under test, so ignoring them can lead to bugs being missed. The non-deterministic behaviour of flaky tests is also a major snag to continuous integration, where a single flaky test can fail an entire build. For researchers, flaky tests challenge the assumption that a test failure implies a bug, an assumption that many fundamental techniques in software engineering research rely upon, including test acceleration, mutation testing, and fault localisation. Despite increasing research interest in the topic, open problems remain. In particular, there has been relatively little attention paid to the views and experiences of developers, despite a considerable body of empirical work. This is essential to guide the focus of research into areas that are most likely to be beneficial to the software engineering industry. Furthermore, previous automated techniques for detecting flaky tests are typically either based on exhaustively rerunning test cases or machine learning classifiers. The prohibitive runtime of the rerunning approach and the demonstrably poor inter-project generalisability of classifiers leaves practitioners with a stark choice when it comes to automatically detecting flaky tests. In response to these challenges, I set two high-level goals for this thesis: (1) to enhance the understanding of the manifestation, causes, and impacts of flaky tests; and (2) to develop and empirically evaluate efficient automated techniques for mitigating flaky tests. In pursuit of these goals, this thesis makes five contributions: (1) a comprehensive systematic literature review of 76 published papers; (2) a literature-guided survey of 170 professional software developers; (3) a new feature set for encoding test cases in machine learning-based flaky test detection; (4) a novel approach for reducing the time cost of rerunning-based techniques for detecting flaky tests by combining them with machine learning classifiers; and (5) an automated technique that detects and classifies existing flaky tests in a project and produces reusable project-specific machine learning classifiers able to provide fast and accurate predictions for future test cases in that project.

Metadata

Supervisors:	Phil, McMinn
Keywords:	software engineering, software testing, flaky tests
Awarding institution:	University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield) The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Depositing User:	Dr Owain Parry
Date Deposited:	08 Nov 2023 14:47
Last Modified:	08 Nov 2023 14:47
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:33698

Download

Final eThesis - complete (pdf)

Filename: main.pdf

Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License

CLICK TO DOWNLOAD

You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.

Understanding and Mitigating Flaky Software Test Cases

Abstract

Metadata

Download

Final eThesis - complete (pdf)

Export

Statistics