Lucas, Craig (2007) Prediction of Protein Function Using Statistically Significant Sub-Structure Discovery. PhD thesis, University of Leeds.
Abstract
Proteins perform a vast number of functional roles. The number of protein structures available for analysis continues to grow and, with the development of methods to predict protein structure directly from genetic sequence without imaging technology, the number of structures with unknown function is likely to increase. Computational methods for predicting the function of protein structures are therefore desirable. Several existing systems attempt to assign function, but their use is inadvisable without human intervention. Methods for searching proteins with shared function for a shared structural feature are often limited in ways that are counterproductive to a general discovery solution. Assigning accurate scores to significant sub-structures also remains an area of development. A method is presented that can find common sub-structures between multiple proteins, without the size or structural limitations of existing discovery methods. A novel measure for assigning statistical significance is also presented. These methods are tested on artificially generated and real protein data to demonstrate that they can discover statistically significant sub-structures. With a database of such sub-structures, it is then shown that prediction of function for a new protein is possible based on the presence of the discovered significant patterns.
Metadata
Supervisors: | Bulpitt, A.J. |
---|---|
Publicly visible additional information: | Supplied directly by the School of Computing, University of Leeds. |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.435787 |
Depositing User: | Dr L G Proll |
Date Deposited: | 24 Mar 2011 15:17 |
Last Modified: | 07 Mar 2014 11:23 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:1345 |