Alharbi, Emad ORCID: https://orcid.org/0000-0001-8476-4865 (2022) Improving the Performance of Protein Model Synthesis from Electron-Density Maps. PhD thesis, University of York.
Abstract
Proteins are large biological molecules and the building blocks of all cells in living organisms. Modelling their structure supports the understanding of their role in key biological processes, including the onset, evolution and cure of diseases. Nevertheless, protein model building is extremely challenging. Although the computational tools for protein model building (e.g., from crystallographic data sets) have improved significantly in recent years, they still perform poorly for protein structures for which the only available data sets have low resolution and poor phase distributions.
This thesis introduces new methods that support and improve model building for such protein structures. We start with a systematic evaluation of all major automated crystallographic model-building pipelines using 1211 protein structures (202 at original resolution and 1009 at truncated resolutions). Using the results of this study as a baseline, we then propose and show the effectiveness of using pairwise pipeline combinations to build better protein models for many crystallographic data sets.
As the performance of individual pipelines and pipeline combinations depends on the input data set, we introduce a predictive machine learning model that recommends pipelines or pipeline combinations suitable for a given data set, helping researchers avoid the time-consuming running of pipelines likely to perform poorly. The model bases its predictions on statistical features calculated from the electron-density map, and is available as a freely accessible web application.
Finally, we introduce a neural network trained to recognise incorrect parts of a protein model during the building process. Developed using large training data sets newly created for this purpose, and integrated into the protein model building software Buccaneer, the neural network enables Buccaneer to avoid these incorrect parts and to produce protein models with significantly improved completeness and fit to the crystallographic data.
Metadata
Supervisors: | Calinescu, Radu and Cowtan, Kevin |
---|---|
Keywords: | software, Buccaneer, model building, protein |
Awarding institution: | University of York |
Academic Units: | The University of York > Computer Science (York) |
Identification Number/EthosID: | uk.bl.ethos.858870 |
Depositing User: | Mr Emad Alharbi |
Date Deposited: | 20 Jun 2022 10:44 |
Last Modified: | 21 Aug 2022 09:53 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:30899 |
Download
Examined Thesis (PDF)
Filename: Alharbi_Thesis.pdf
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License