Carey, Sian (2025) Evaluating Fairness in Machine Learning for Structured Medical Data. Integrated PhD and Master thesis, University of Leeds.
Abstract
Fair artificial intelligence (AI) and machine learning (ML) is a rapidly growing field of research, and is particularly important in applications of AI in healthcare. Ensuring that models perform fairly is an important step towards the ethical implementation of AI. In this work I focus on answering three research questions, which consider how medical ML, fairness and simulated data interact, across four case studies. The first two case studies both involve investigating medical ML models for fairness. Through the evaluation of a decision support tool for sepsis treatment and its suitability for maternal sepsis, I compare current fairness evaluation techniques. I provide a clinical contribution through the evaluation of pre-operative mortality risk tools that are currently recommended for use in the UK, identifying that they are fair, but underperforming. From these investigations I identify multiple areas where evaluation methods need to develop and improve, and select one to focus on. The remainder of my work focuses on this area, considering how a lack of real-world data affects fairness evaluations and which data simulation techniques can assist with the problem. I use a data simulation technique to generate data with varying levels of bias to evaluate off-the-shelf ML methods for fairness, showing that this is possible without real-world underlying data. My results show that random forest and k-nearest neighbour models are more likely to be impartial to bias regardless of the bias in the training data. I then investigate whether this simulation technique is suitable in more complicated situations, comparing real-world data to data simulated from the real-world data and data simulated from a graph. My experiments show that whilst the technique creates data similar to the real world, the simulated data is not usable for model training.
Throughout this work, I provide novel contributions to the fields of medical ML, fair ML and the intersection of the two, through my investigations of fairness evaluations in medical ML, performance results on ML models currently in use within the United Kingdom’s National Health Service (NHS) and evaluation of simulated data techniques.
Metadata
| Supervisors: | McInerney, Ciarán and Kotze, Alwyn and Lawton, Tom and Habli, Ibrahim and Johnson, Owen and de Kamps, Marc |
|---|---|
| Keywords: | Fair; Fairness; Machine Learning; Artificial intelligence; Medical AI; Bias |
| Awarding institution: | University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
| Date Deposited: | 01 Apr 2026 14:05 |
| Last Modified: | 01 Apr 2026 14:05 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:38310 |
Download
Final eThesis - complete (pdf)
Embargoed until: 1 April 2027
Filename: Thesis_Final_SCarey.pdf