Cai, Qingdong
ORCID: https://orcid.org/0009-0008-0437-8664
(2025)
Image-Level Weakly Supervised Semantic Segmentation Based on Novel Class Activation Mapping.
PhD thesis, University of Sheffield.
Abstract
Weakly Supervised Semantic Segmentation (WSSS) is a vital area in computer vision that focuses on training segmentation models without relying on detailed, pixel-level annotations. Currently, most WSSS methods complete this task by leveraging Class Activation Mapping (CAM). However, existing CAM methods tend to highlight only the most discriminative regions of objects, often resulting in pseudo semantic segmentation labels that fail to cover the entire object and lack spatial accuracy. This limitation leads to incomplete learning of target features by the subsequent semantic segmentation network.
To address these challenges, this study introduces a novel CAM algorithm, Region-CAM, which improves the quality of activation maps so that they capture more comprehensive and detailed object regions. It does so by treating gradients and features separately, based on one hypothesis and one finding. Moreover, since Region-CAM is a method for processing the gradients of network features, it can be flexibly integrated with existing WSSS methods to achieve better performance.
Furthermore, existing evaluation metrics mainly focus on the faithfulness and interpretability of activation algorithms rather than the accuracy of activation maps. To fill this gap, this research proposes a novel evaluation metric, mean Mask Intersection Energy (mMIE), which accurately assesses the quality of activation methods with respect to their usefulness for downstream tasks.
Finally, a specialized WSSS network for the food domain is proposed. This study is the first attempt to explore WSSS in the food domain, reducing the reliance on extensive annotations. A novel single-stage dual-branch network (SSDB-Net) is introduced to improve the performance of food WSSS tasks based on image-level annotations. We find that the performance of the food WSSS task at this stage is mainly limited by the recall of the network. Region-CAM is also applied to the proposed network to further demonstrate its effectiveness.
Metadata
| Supervisors: | Abhayaratne, Charith |
|---|---|
| Keywords: | Weakly supervised semantic segmentation, semantic segmentation, food weakly supervised semantic segmentation, class activation mapping (CAM), evaluation metrics. |
| Awarding institution: | University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Electronic and Electrical Engineering (Sheffield) |
| Date Deposited: | 19 Jan 2026 10:00 |
| Last Modified: | 19 Jan 2026 10:00 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:38036 |
Download
Final eThesis - complete (pdf)
Filename: Image-Level Weakly Supervised Semantic Segmentation Based on Novel Class Activation Mapping.pdf
Licence: This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License