Nadarajah, Ramesh ORCID: https://orcid.org/0000-0001-9895-9356 (2023) Prediction of new-onset atrial fibrillation using routinely available clinical data. PhD thesis, University of Leeds.
Abstract
Atrial fibrillation (AF) is common and associated with increased risk of stroke, heart failure and death, yet a fifth of AF disease burden is estimated to be undiagnosed. Screening for AF can increase early detection of AF and associated guideline-directed treatment, but is limited by low yields of newly detected AF. A scalable strategy is required to identify high-risk individuals to make screening for AF more efficient. In the United Kingdom (UK), 98% of the population are registered in primary care with a routinely collected electronic health record (EHR). The aim of my thesis was to design and evaluate a prediction model that estimates risk of new-onset AF using nationwide routinely collected primary care EHR data.
A systematic review and meta-analysis was completed to establish the current knowledge base and to inform quantitative analysis. Multivariable prediction models developed and/or validated for incident AF in community-based EHRs were summarised and measures of discrimination performance synthesised. Models eligible for meta-analysis demonstrated only moderate discrimination performance and predicted AF risk over a long prediction horizon, which may be less relevant to guiding AF screening. Models developed with machine learning produced stronger prediction performance for new-onset AF than models developed with traditional regression techniques. Knowledge gaps observed in the systematic review were used to formulate the protocol for developing a novel prediction model for new-onset AF.
Studies were conducted using UK primary care EHRs of 2 081 139 individuals aged 30 years and older without a preceding diagnosis of AF or atrial flutter. A prediction model for incident AF within the next 6 months was developed using a Random Forest classifier (Future Innovations in Novel Detection of Atrial Fibrillation, FIND-AF). FIND-AF could be applied to all EHRs in the dataset and demonstrated excellent discrimination performance on internal validation in the holdout testing dataset (area under the receiver operating characteristic curve [AUROC] 0.824, 95% CI 0.814-0.834). Discrimination performance was robust in both men (AUROC 0.819, 95% CI 0.809-0.829) and women (AUROC 0.821, 95% CI 0.810-0.831), and across different ethnic groups (AUROC: White 0.810, 95% CI 0.799-0.821; Asian 0.796, 95% CI 0.693-0.893; Black 0.801, 95% CI 0.680-0.973; other non-White ethnic minority 0.805, 95% CI 0.765-0.845; and ethnicity unrecorded 0.823, 95% CI 0.770-0.875).
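As an illustration of how a discrimination estimate such as "AUROC 0.824, 95% CI 0.814-0.834" can be obtained, the following is a minimal sketch that computes a rank-based AUROC and a percentile bootstrap confidence interval. It is not the thesis code: the abstract does not state how the confidence intervals were derived, so the bootstrap is an assumption, and the function names (`auroc`, `bootstrap_ci`) and the synthetic data are hypothetical.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one case and one non-case")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Synthetic stand-in for predicted risks; not data from the thesis.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
    scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    point = auroc(labels, scores)
    lower, upper = bootstrap_ci(labels, scores, n_boot=200)
    print(f"AUROC {point:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

The same two functions could be applied within each sex or ethnic group to obtain subgroup estimates; smaller subgroups would be expected to give wider intervals, as seen in the figures reported above.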
The EHRs in the testing dataset were then used to determine the association of higher predicted risk of AF and the occurrence of other cardio-renal-metabolic diseases and death. Cumulative incidence rates were calculated and Fine and Gray’s models fitted at 1, 5, and 10 years for nine diseases and death adjusting for competing risks. Higher predicted risk of AF, compared with lower predicted risk, was associated with higher risk of each of the outcomes (hazard ratio [HR], heart failure 12.54, 95% CI 12.08-13.01; aortic stenosis 9.98, 95% CI 9.16-10.87; stroke/transient ischaemic attack 8.07, 95% CI 7.80-8.34; chronic kidney disease 6.85, 95% CI 6.70-7.00; peripheral vascular disease 6.62, 95% CI 6.28-6.98; valvular heart disease 6.49, 95% CI 6.14-6.85; myocardial infarction 5.02, 95% CI 4.82-5.22; diabetes mellitus 2.05, 95% CI 2.00-2.10; chronic obstructive pulmonary disease 2.02, 95% CI 2.00-2.05; and death 10.45, 95% CI 10.23-10.68), including after adjustment for age, sex, ethnicity, and presence of any of the other outcomes at baseline.
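To make the idea of cumulative incidence under competing risks concrete, here is a minimal sketch of the non-parametric Aalen-Johansen estimator. It only illustrates why death from another cause must be treated as a competing event rather than as censoring; it does not implement the Fine and Gray regression used in the thesis, and the abstract does not say which estimator produced the cumulative incidence rates. The function name and the toy data are hypothetical.

```python
def cumulative_incidence(times, events, cause, horizon):
    """Aalen-Johansen estimate of P(event of type `cause` occurs by `horizon`).

    times  : follow-up time for each individual
    events : 0 = censored, any other integer = code of the event that occurred
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        j = i
        d_cause = 0  # events of the cause of interest at time t
        d_any = 0    # events of any cause at time t
        while j < len(data) and data[j][0] == t:
            if data[j][1] == cause:
                d_cause += 1
            if data[j][1] != 0:
                d_any += 1
            j += 1
        # Increment uses the event-free probability just before time t.
        cif += event_free * d_cause / n_at_risk
        event_free *= 1.0 - d_any / n_at_risk
        n_at_risk -= j - i  # events and censored individuals leave the risk set
        i = j
    return cif


if __name__ == "__main__":
    # Toy data: 1 = disease of interest, 2 = competing death, 0 = censored.
    times = [1, 2, 3, 4]
    events = [1, 2, 1, 0]
    for code in (1, 2):
        print(code, cumulative_incidence(times, events, code, horizon=5))
```

Evaluating the function at horizons of 1, 5, and 10 years would give cumulative incidence at the time points named above.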
Research grant funding was applied for and awarded to conduct a prospective clinical validation study of the performance of FIND-AF. Ethics approval was obtained and a study protocol formulated to implement the algorithm in the UK primary care setting and establish the yield of new AF across risk estimates when electrocardiogram monitoring is conducted.
Parsimonious regression-based prediction models for new-onset AF were also developed and internally validated for prediction horizons extending from 6 months (AUROC 0.803, 95% CI 0.789-0.821) to 10 years (AUROC 0.780, 95% CI 0.777-0.784), with the aim that these can be applied outside of an EHR setting as a web-based app or risk scoring system, and be used to guide both screening and primary prevention interventions for AF.
In conclusion, my PhD has developed and evaluated novel prediction models for new-onset AF using EHR data routinely recorded in primary care. Such an endeavour addresses an unmet clinical need to efficiently guide AF screening at a population level, in the face of unacceptable morbidity when AF is only diagnosed after the first complication. The results of my PhD will provide a means not only to test the effectiveness of a risk-guided AF screening strategy in clinical studies, but also to further characterise individuals with the machine learning-derived EHR phenotype of higher predicted AF risk, to determine whether this phenotype is an actionable target for improving patient outcomes.
Metadata
Supervisors: | Gale, Chris and Wu, Jianhua |
---|---|
Keywords: | atrial fibrillation; prediction; primary care; machine learning |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Medicine and Health (Leeds) > Leeds Institute of Genetics, Health and Therapeutics (LIGHT) > Centre for Epidemiology & Biostatistics (Leeds); The University of Leeds > Faculty of Medicine and Health (Leeds); The University of Leeds > Faculty of Medicine and Health (Leeds) > School of Medicine (Leeds); The University of Leeds > Faculty of Medicine and Health (Leeds) > School of Medicine (Leeds) > Academic Unit of Epidemiology and Health Services Research (Leeds) |
Academic unit: | School of Medicine |
Depositing User: | Dr Ramesh Nadarajah |
Date Deposited: | 29 Jan 2024 15:15 |
Last Modified: | 29 Jan 2024 15:15 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:34045 |
Download
Final eThesis - complete (pdf)
Embargoed until: 1 February 2026