Antanaviciute, Agne (2017) Novel Algorithm Development for ‘Next-Generation’ Sequencing Data Analysis. PhD thesis, University of Leeds.
Abstract
In recent years, the decreasing cost of ‘Next generation’ sequencing has spawned numerous applications for interrogating whole genomes and transcriptomes in research, diagnostic and forensic settings. While the innovations in sequencing have been explosive, the development of scalable and robust bioinformatics software and algorithms for the analysis of new types of data generated by these technologies has struggled to keep up. As a result, large volumes of NGS data available in public repositories are severely underutilised, despite providing a rich resource for data mining applications. Indeed, the bottleneck in genome and transcriptome sequencing experiments has shifted from data generation to bioinformatics analysis and interpretation.
This thesis focuses on the development of novel bioinformatics software to bridge the gap between data availability and interpretation. The work is split between two core topics – computational prioritisation/identification of disease gene variants and identification of RNA N6-adenosine methylation from sequencing data.
The first chapter briefly discusses the emergence and establishment of NGS technology as a core tool in biology and its current applications and perspectives.
Chapter 2 introduces the problem of variant prioritisation in the context of Mendelian disease, where tens of thousands of potential candidates are generated by a typical sequencing experiment. Novel software for candidate gene prioritisation is then described, which utilises data mining of tissue-specific gene expression profiles (Chapter 3). The second part of the chapter investigates an alternative approach to candidate variant prioritisation by leveraging functional and phenotypic descriptions of genes and diseases from multiple biomedical domain ontologies (Chapter 4).
Chapter 5 discusses N6-adenosine methylation, a recently re-discovered post-transcriptional modification of RNA. The core of the chapter describes novel software developed for transcriptome-wide detection of this epitranscriptomic mark from sequencing data. Chapter 6 presents a case study application of the software, reporting the previously uncharacterised RNA methylome of Kaposi’s Sarcoma Herpes Virus. The chapter further discusses a putative novel N6-methyl-adenosine RNA-binding protein and its possible roles in the progression of viral infection.
Metadata
Supervisors: | Carr, Ian and Bonthron, David and Watson, Christopher |
---|---|
Keywords: | NGS; Gene Prioritisation; Disease Genes; Gene Expression; m6A; KSHV |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Medicine and Health (Leeds) > Institute of Molecular Medicine (LIMM) (Leeds) > Section of Genetics (Leeds); The University of Leeds > Faculty of Medicine and Health (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.745536 |
Depositing User: | Miss Agne Antanaviciute |
Date Deposited: | 21 Jun 2018 11:47 |
Last Modified: | 11 Jul 2020 09:53 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:20734 |
Downloads
Final eThesis - complete (pdf)
Filename: Thesis 2.pdf
Description: Main Text
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
Final eThesis - complete (pdf)
Filename: Appendix B - Supplementary Datasets.xlsx
Description: Supplementary Data
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
Final eThesis - complete (pdf)
Filename: Appendix A - List of Bash Commands Used.docx
Description: Appendix A
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License