Extending the Graphical Representation of four KEGG Pathways for a Better Understanding of Prostate Cancer Using Machine Learning of Graphical models

ALORAINI, ADEL ABDULLAH M (2011) Extending the Graphical Representation of four KEGG Pathways for a Better Understanding of Prostate Cancer Using Machine Learning of Graphical models. PhD thesis, University of York.

ThesisMain.pdf
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.

Abstract

This thesis presents a novel contribution to computational biology, together with the machine learning methods developed to support it. It shows how the graphical representation of KEGG pathways can be refined using machine learning of graphical models. The focus is mainly on a class of graphical models called Bayesian networks, and different ways of learning Bayesian networks are discussed throughout the thesis. The work is based on Affymetrix gene expression microarray profiles and penalised Gaussian linear models. Penalisation in linear models involves selecting the most important parents and estimating their associated coefficients simultaneously using L1-regression. The sparse datasets generated by Affymetrix microarray technology are the key consideration in this thesis when learning Bayesian networks. The work can therefore be viewed as developing robust methods that avoid the overfitting usually associated with gene expression datasets, and as helping to reveal more detail about a well-known discrepancy in KEGG pathways. The problem is to learn from a large number of candidate predictors and a small number of samples (p >> n), and for such a problem the goal is to apply model selection methods that achieve accurate prediction, interpretable models, and stable models. Prediction, and the identification of the most powerful predictors, can be improved by using methods that trade off bias against variance. Identifying which predictors are meaningful, rather than using all of them, yields interpretable models. Finally, when only the most important predictors are chosen, a small change in the data does not result in large changes in the selected subset of predictors, which gives stability to the learnt models.
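
The idea of selecting parents and estimating their coefficients in one step with L1-regression can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a standard lasso objective solved by cyclic coordinate descent, and the function names, the synthetic data (a target gene driven by candidate genes 3 and 17), and the penalty value are all hypothetical choices made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200, tol=1e-8):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p entries each; y is a list of n responses.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # equals y - X b, because coef starts at zero
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        largest_change = 0.0
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes its own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * coef[j]) for i in range(n)) / n
            updated = soft_threshold(rho, lam) / col_scale[j]
            delta = updated - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                coef[j] = updated
                largest_change = max(largest_change, abs(delta))
        if largest_change < tol:
            break
    return coef


def select_parents(X, y, lam):
    """Return (indices of candidates with a non-zero coefficient, all coefficients)."""
    coef = lasso_coordinate_descent(X, y, lam)
    return [j for j, b in enumerate(coef) if b != 0.0], coef


def make_synthetic_expression(n_samples=20, n_genes=60, seed=0):
    """Hypothetical p >> n data: the target depends on candidate genes 3 and 17 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [2.0 * row[3] - 1.5 * row[17] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    parents, coef = select_parents(X, y, lam=0.3)  # lam chosen arbitrarily
    print("selected parents:", parents)
    print("coefficients:", [round(coef[j], 3) for j in parents])
```

A larger penalty `lam` drives more coefficients exactly to zero, which is the bias-variance trade-off the abstract refers to: fewer selected parents give a more interpretable and typically more stable model at the cost of shrunken coefficient estimates. In practice the penalty would be chosen by a model selection procedure such as cross-validation rather than fixed by hand as it is here.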

Item Type: Thesis (PhD)
Academic Units: The University of York > Computer Science (York)
Depositing User: MR ADEL ABDULLAH M ALORAINI
Date Deposited: 08 Nov 2011 15:13
Last Modified: 08 Aug 2013 08:47
URI: http://etheses.whiterose.ac.uk/id/eprint/1711
