Egbelo, Terence
ORCID: 0009-0009-0953-6702
(2025)
Follow the Metapath: Improved Interpretable Completion of Biomedical Knowledge Graphs in Drug Discovery.
PhD thesis, University of Sheffield.
Abstract
Biomedical knowledge graphs (KGs) have emerged as a central tool for integrating and representing the complete spectrum of humanity’s understanding of health, disease and cures. Yet much work remains to realise their potential for surfacing clear and accurate answers to critical questions in drug development, such as the expected potency of druglike compounds and the safety of therapeutic targets. Popular approaches to KG-based inference in the domain suffer from deficiencies in interpretability, unaddressed data imbalances and ill-suited evaluation policies. Moreover, though KG designs and accessibility continue to improve, knowledge resources of great potential value are often overlooked.
This work reports efforts to address these obstacles to effective KG-based inference within two specific tasks in the discovery phase of preclinical drug development. The first, termed bioactivity prediction, concerns the estimation of drug potency. The second is the prediction of associations between therapeutic targets and the adverse events (AEs) patients may experience following treatment.
Adopting an inherently interpretable KG-based inference approach based on features called “metapaths”, this work probes the predictive patterns which can be learnt from the KG structure. In the activity prediction task, it is shown that seemingly powerful signals are often artefacts with little association to the genuine and relevant knowledge being sought in the KG. Nevertheless, signs of complementarity emerge between other, more credible KG patterns and conventional representations of bioactivity data.
In the target-AE association prediction task, the work develops a comprehensive methodological strategy with intended impacts within that problem area and in the broader practice of biomedical KG-based inference. It includes a novel KG processing technique to uncover previously obscured predictive signal in the graph structure. Results against reference methods demonstrate its competitiveness on challenging and diverse sets of target-AE association data.
Metadata
| Field | Value |
|---|---|
| Supervisors: | Gillet, Val and Kurt, Zeyneb and Zhang, Ziqi |
| Keywords: | Drug discovery, drug development, knowledge graphs, chemoinformatics, bioinformatics, machine learning, AI, explainable AI, network biology, network medicine, systems biology |
| Awarding institution: | University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
| Date Deposited: | 18 Mar 2026 15:44 |
| Last Modified: | 18 Mar 2026 15:44 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:38415 |
Download
Final eThesis - complete (pdf)
Embargoed until: 31 December 2026
Filename: Terence Egbelo PhD Thesis.pdf