Tabakhi, Sina
ORCID: 0000-0002-3075-7907
(2026)
Multimodal learning with graphs for multiomics cancer classification.
PhD thesis, University of Sheffield.
Abstract
With advances in high-throughput technologies, multiple high-dimensional molecular modalities, known as multiomics, have become increasingly available, offering complementary insights into cancer biology. Graph-based multimodal learning has shown remarkable potential in integrating these modalities to unravel cancer complexity, enhance biological predictions, and facilitate biomarker discovery. Despite their promise, these models face three challenges. First, they struggle with small patient cohorts and high-dimensional features, often applying feature selection to each omics modality independently without capturing relationships across modalities. Second, conventional graph-based models rely on homogeneous graphs that cannot represent multiple node and edge types. Third, handling missing modalities remains an open problem, as the number of missing-modality patterns increases exponentially with the number of modalities.
This thesis proposes four graph-based multimodal learning models designed to improve accuracy, interpretability, and biomarker discovery in cancer diagnosis. The first model develops a multimodal feature selection method based on a multi-agent system that captures both intra- and inter-omics interactions. Building on this, the second model introduces the automatic construction of heterogeneous graphs from multiomics data to learn holistic, omics-specific representations. To handle datasets with missing modalities, the third model presents a direct prediction approach for partial modalities by introducing a patient-modality multi-head attention mechanism, whose complexity increases linearly with the number of modalities while adapting to missing-pattern variability. Finally, the fourth model extends the proposed multimodal feature selection method to handle missing modalities and is applied to a use case investigating the effects of diet-induced obesity and metformin treatment on molecular changes in mice. Comprehensive experiments show the superior performance of the proposed methods on real-world cancer datasets and their effectiveness in identifying biologically meaningful biomarkers.
Metadata
| Supervisors: | Lu, Haiping |
|---|---|
| Keywords: | Machine learning, multimodal learning, multiomics integration, cancer classification, missing modalities, feature selection, graph neural networks, heterogeneous graphs |
| Awarding institution: | University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield); The University of Sheffield > Faculty of Engineering (Sheffield) |
| Date Deposited: | 20 Apr 2026 07:46 |
| Last Modified: | 20 Apr 2026 07:46 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:38611 |
Download
Final eThesis - complete (pdf)
Filename: PhD_Thesis_Tabakhi_Sina.pdf
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License