Paterson, Mary Louisa
ORCID: https://orcid.org/0009-0004-9144-0438
(2025)
AI Analysis of Voice to Aid Laryngeal Cancer Diagnosis.
Integrated PhD and Master thesis, University of Leeds.
Abstract
The use of patient voice as an indicator of vocal pathologies is common in clinical practice. In recent years, work on the use of Artificial Intelligence (AI) for this task has increased. In this thesis, I focus on the detection of laryngeal cancer from patient voice using AI. Through a scoping literature review, I found that previous work is ad hoc in its approaches, with no clear advantage in any given method. I also found that the data and code used to create these systems are rarely shared, making it difficult for significant advances to be made. Previous work also does not discuss how these systems will be practically implemented in clinical settings. This area suffers from a lack of publicly available data, and so, to increase the amount of available data, I produced a new dataset in collaboration with Leeds Teaching Hospitals NHS Trust (LTHT) containing voice recordings from patients referred on the Urgent Suspected Cancer Referral pathway in Leeds. This dataset has been made available for researchers to use in future research. I created benchmark models on other publicly available datasets to classify patients with benign and malignant vocal pathologies. These models improve upon previously published methods, achieving a balanced accuracy of 73.7%, sensitivity of 72.0%, and specificity of 75.4% using audio features only. Integrating demographic and symptom features improved these methods, achieving a balanced accuracy of 83.7%, sensitivity of 84.0%, and specificity of 83.3%. I thoroughly evaluated these methods and their robustness to variations in recording environments in the form of background noise and reverberation, and found that they were highly sensitive to these factors. I also investigated the effects that different recording devices can have on model performance and found that these can also have significant negative effects.
These results show that, while such AI systems could be clinically implemented, significant steps must first be taken to prove their robustness to the input variation likely to be seen in the real world.
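The balanced accuracy figures quoted above are consistent with the usual definition for a binary task: the mean of sensitivity and specificity. The following is a minimal sketch under that assumption. The function name `balanced_accuracy` is illustrative and not taken from the thesis, and the thesis may have computed its figures from unrounded values.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity, both given in percent.

    Assumes the standard binary-classification definition; this is an
    illustrative helper, not code from the thesis.
    """
    return (sensitivity + specificity) / 2


# Sensitivity and specificity as quoted in the abstract (percent).
audio_only = balanced_accuracy(72.0, 75.4)
with_demographics = balanced_accuracy(84.0, 83.3)

print(f"Audio features only: {audio_only:.2f}%")
print(f"Audio + demographic and symptom features: {with_demographics:.2f}%")
```

With the rounded inputs, the second case gives 83.65%, which agrees with the reported 83.7% to within rounding.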
Metadata
| Supervisors: | Cutillo, Luisa and Moor, James |
|---|---|
| Awarding institution: | University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
| Date Deposited: | 10 Oct 2025 09:32 |
| Last Modified: | 10 Oct 2025 09:32 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:37488 |
Download
Final eThesis - complete (pdf)
Filename: MaryPaterson_Thesis_Corrections.pdf
Licence: This work is licensed under a Creative Commons Attribution NonCommercial ShareAlike 4.0 International License