Novel Algorithm Development for ‘Next Generation’ Sequencing Data Analysis

Abstract

In recent years, the decreasing cost of ‘next generation’ sequencing (NGS) has spawned numerous applications for interrogating whole genomes and transcriptomes in research, diagnostic and forensic settings. While innovation in sequencing has been explosive, the development of scalable and robust bioinformatics software and algorithms for analysing the new types of data generated by these technologies has struggled to keep up. As a result, the large volumes of NGS data available in public repositories are severely underutilised, despite providing a rich resource for data mining applications. Indeed, the bottleneck in genome and transcriptome sequencing experiments has shifted from data generation to bioinformatics analysis and interpretation.
This thesis focuses on the development of novel bioinformatics software to bridge the gap between data availability and interpretation. The work is split between two core topics: computational prioritisation and identification of disease gene variants, and identification of RNA N6-adenosine methylation from sequencing data.
The first chapter briefly discusses the emergence and establishment of NGS technology as a core tool in biology and its current applications and perspectives.
Chapter 2 introduces the problem of variant prioritisation in the context of Mendelian disease, where a typical sequencing experiment generates tens of thousands of potential candidates. Chapter 3 describes novel software developed for candidate gene prioritisation, which utilises data mining of tissue-specific gene expression profiles. Chapter 4 investigates an alternative approach to candidate variant prioritisation that leverages functional and phenotypic descriptions of genes and diseases from multiple biomedical domain ontologies.
Chapter 5 discusses N6-adenosine methylation, a recently rediscovered post-transcriptional modification of RNA. The core of the chapter describes novel software developed for transcriptome-wide detection of this epitranscriptomic mark from sequencing data. Chapter 6 presents a case study application of the software, reporting the previously uncharacterised RNA methylome of Kaposi’s sarcoma-associated herpesvirus (KSHV). The chapter further discusses a putative novel N6-methyladenosine RNA-binding protein and its possible roles in the progression of viral infection.

Metadata

Supervisors:	Carr, Ian and Bonthron, David and Watson, Christopher
Keywords:	NGS; Gene Prioritisation; Disease Genes; Gene Expression; m6A; KSHV
Awarding institution:	University of Leeds
Academic Units:	The University of Leeds > Faculty of Medicine and Health (Leeds) > Institute of Molecular Medicine (LIMM) (Leeds) > Section of Genetics (Leeds) The University of Leeds > Faculty of Medicine and Health (Leeds)
Identification Number/EthosID:	uk.bl.ethos.745536
Depositing User:	Miss Agne Antanaviciute
Date Deposited:	21 Jun 2018 11:47
Last Modified:	11 Jul 2020 09:53
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:20734

Downloads

Final eThesis - complete (pdf)

Filename: Thesis 2.pdf

Description: Main Text

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Final eThesis - complete (pdf)

Filename: Appendix B - Supplementary Datasets.xlsx

Description: Supplementary Data

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Final eThesis - complete (pdf)

Filename: Appendix A - List of Bash Commands Used.docx

Description: Appendix A

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

