Thakur, Rohit (2018) Developing statistical and bioinformatic analysis of genomic data from tumours. PhD thesis, University of Leeds.
Abstract
Previous prognostic signatures for melanoma based on tumour transcriptomic data were developed predominantly on cohorts of AJCC (American Joint Committee on Cancer) stages III and IV melanoma. Since 92% of melanoma patients are diagnosed at AJCC stages I and II, there is an urgent need for better prognostic biomarkers to allow patient stratification for receiving early adjuvant therapies.
This study uses genome-wide tumour gene expression levels and clinico-histopathological characteristics of patients from the Leeds Melanoma Cohort (LMC). Unsupervised and supervised classification approaches were applied to the transcriptomic data to identify biological classes of melanoma and to develop prognostic classification models, respectively.
Unsupervised clustering identified six biologically distinct primary melanoma classes (LMC classes). Unlike previous molecular classes of melanoma, the LMC classes were prognostic in both the whole LMC dataset and in stage I tumours. The prognostic value of the LMC classes was replicated in an independent dataset, but insufficient data were available to replicate it in an AJCC stage I subset.
Supervised classification using the Random Forest (RF) approach performed better when adjustments were made for class imbalance, whereas such adjustments did not improve the performance of the Support Vector Machine (SVM). Overall, however, RF and SVM gave similar results, with RF only marginally better. Combining clinical and transcriptomic information in the RF further improved the performance of the prediction model in comparison to using clinical information alone. Finally, the agnostically derived LMC classes and the supervised RF model showed convergence in their association with outcome in some groups of patients, but not in others.
In conclusion, this study reports six molecular classes of primary melanoma with prognostic value in stage I disease and overall, and a prognostic classification model that predicts outcome in primary melanoma.
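The abstract does not say which adjustment for class imbalance was used. As an illustration only, one common option is to weight each outcome class inversely to its frequency, so that the rarer class counts for more during training. The sketch below computes such weights with the heuristic `n_samples / (n_classes * class_count)`, which is the same formula scikit-learn documents for `class_weight="balanced"`. The function name and the example labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rarer classes receive larger weights. This is an assumed, generic
    adjustment, not necessarily the one applied in the thesis.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical outcome labels: 8 patients without the event, 2 with it.
labels = ["no_event"] * 8 + ["event"] * 2
weights = balanced_class_weights(labels)
```

Weights of this kind can be passed to a classifier that supports per-class weighting. Resampling the training data (for example, undersampling the majority class) is an alternative adjustment with a similar aim.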
Metadata
| Field | Value |
|---|---|
| Supervisors | Barrett, Jenny; Newton-Bishop, Julia; Nsengimana, Jeremie |
| Keywords | primary melanoma, prognostic signatures, Stage I prognostic signature, machine learning, transcriptomic data analysis, clustering, PAM, Random Forest, Support Vector Machine |
| Awarding institution | University of Leeds |
| Academic Units | The University of Leeds > Faculty of Medicine and Health (Leeds) > Institute of Molecular Medicine (LIMM) (Leeds) > Section of Epidemiology and Biostatistics (Leeds) |
| Identification Number/EthosID | uk.bl.ethos.772831 |
| Depositing User | Rohit Thakur |
| Date Deposited | 15 Apr 2019 09:15 |
| Last Modified | 11 Mar 2020 10:53 |
| Open Archives Initiative ID (OAI ID) | oai:etheses.whiterose.ac.uk:22674 |
Download
Final eThesis - complete (pdf)
Filename: Thakur_R_Medicine_PhD_2018.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License