Shar, Nisar Ahmed (2016) Statistical methods for predicting genetic regulation. PhD thesis, University of Leeds.
Abstract
Transcriptional regulation of gene expression is essential for cellular differentiation and function, and defects in the process are associated with cancer. Transcription is regulated by cis-acting regulatory regions and trans-acting regulatory elements. Transcription factors (TFs) bind to enhancers and repressors and form complexes by interacting with each other to control the expression of genes. Understanding the regulation of genes would help us to understand the biological system and could help identify therapeutic targets for diseases such as cancer. The ENCODE project has mapped the binding sites of many TFs in some important cell types, and has also mapped DNase I hypersensitivity sites across cell types.
Predicting mutual interactions between transcription factors would help in finding potential transcriptional regulatory networks. Here, we have developed two methods for predicting mutual interactions between transcription factors from ENCODE ChIP-seq data; both methods generated similar results, which supports their accuracy. Functional regions of the genome are known to be conserved, and here we found that transcription factor binding sites shared across multiple cell types, and binding sites that overlap between transcription factor pairs, are more conserved than their respective non-shared or non-overlapping binding sites. The influence of co-binding sites on the expression level of genes was also studied. Most genes mapped to transcription factor co-binding sites have a significantly higher level of expression than genes mapped to sites bound by a single transcription factor.
The ENCODE data suggest a very large number of potential regulatory sites across the complete genome in many cell types, and methods are needed to identify those that are most relevant and to connect them to the genes that they control. A penalized regression method, LASSO, was used to build correlative models, choose two regulatory regions that are predictive of gene expression, and link them to their respective gene.
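To make the penalized-regression idea concrete, the following is a minimal sketch of LASSO fitted by cyclic coordinate descent. It is not the pipeline used in the thesis. Everything in it is an illustrative assumption: the function names, the toy signal matrix (seven hypothetical candidate regions measured in eight hypothetical cell types, built to be mutually orthogonal so the result can be checked by hand), the simulated expression values, and the penalty of 0.25.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(column):
    """Centre a column and scale it to unit variance (population formula)."""
    n = len(column)
    mean = sum(column) / n
    centered = [v - mean for v in column]
    scale = (sum(v * v for v in centered) / n) ** 0.5
    return [v / scale for v in centered]


def lasso_coordinate_descent(columns, response, penalty, n_sweeps=50):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1.

    columns  : list of standardised feature columns (one per candidate region)
    response : centred expression values, one per cell type
    """
    n = len(response)
    coefs = [0.0] * len(columns)
    residual = list(response)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Covariance of region j with the residual, with its own
            # current contribution added back in.
            rho = sum(c * r for c, r in zip(col, residual)) / n + coefs[j]
            new_coef = soft_threshold(rho, penalty)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                coefs[j] = new_coef
    return coefs


# Hypothetical toy data: 7 candidate regions x 8 cell types, using
# mutually orthogonal +1/-1 (Hadamard) patterns as the region signal.
signal = [[(-1.0) ** bin(i & j).count("1") for i in range(8)]
          for j in range(1, 8)]

# Simulated expression: driven mainly by regions 0 and 3, weakly by region 5.
expression = [5.0 + 2.0 * signal[0][i] - 1.5 * signal[3][i] + 0.1 * signal[5][i]
              for i in range(8)]

features = [standardize(col) for col in signal]
mean_expression = sum(expression) / len(expression)
centered_expression = [v - mean_expression for v in expression]

coefs = lasso_coordinate_descent(features, centered_expression, penalty=0.25)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print("selected regions:", selected)
```

In this toy setting the penalty removes the weak contribution of region 5 and every unrelated region, leaving only regions 0 and 3 with non-zero coefficients. Real signal matrices have correlated columns, so the penalty would normally be chosen by cross-validation rather than fixed by hand.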
Here, we show that the regulatory regions we identified accumulate a significant number of the somatic mutations that occur in cancer cells, suggesting that their effects may drive cancer initiation and development. The accumulation of somatic mutations in these regulatory regions is an indication of positive selection, which has also been observed in cancer-related genes.
Metadata
Supervisors: | Westhead, David |
---|---|
Keywords: | Genetic regulation, Cis-regulatory regions, Cancer somatic mutations, TF-TF mutual interactions, Transcriptional regulation |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Biological Sciences (Leeds) > Institute for Molecular and Cellular Biology (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.707057 |
Depositing User: | Mr. Nisar Ahmed Shar |
Date Deposited: | 04 Apr 2017 10:52 |
Last Modified: | 25 Jul 2018 09:54 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:16729 |
Downloads
Final eThesis - complete (pdf)
Filename: Shar_Nisar_SMCB_PhD_2016.pdf
Description: Thesis
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
Supplementary Material
Filename: Additional file 1.xlsx
Description: Additional file 1
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License