Visual Feature Learning

Zhu, Fan (2015) Visual Feature Learning. PhD thesis, University of Sheffield.

Text (Thesis: Visual Feature Learning)
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.

Download (19Mb) | Preview


Categorization is a fundamental problem in many computer vision applications, e.g., image classification, pedestrian detection and face recognition. The robustness of a categorization system relies heavily on the quality of the features by which data are represented. Prior work on feature extraction can be grouped into levels, which, in bottom-up order, are low-level features (e.g., pixels and gradients) and middle/high-level features (e.g., the BoW model and sparse coding). Low-level features can be extracted directly from images or videos, while middle/high-level features are constructed upon low-level features and are designed to enhance the capability of categorization systems based on different considerations (e.g., guaranteeing domain invariance and improving discriminative power). This thesis focuses on the study of visual feature learning. The challenges that remain in designing visual features lie in intra-class variation, occlusions, illumination and viewpoint changes, and insufficient prior knowledge. To address these challenges, I present several visual feature learning methods, which cover the following sub-topics: (i) I start by introducing a segmentation-based object recognition system. (ii) When training data are insufficient, I seek data from other resources, which include images or videos in a different domain, actions captured from a different viewpoint and information in a different media form. In order to transfer such resources appropriately into the target categorization system, four transfer learning-based feature learning methods are presented, addressing cross-view, cross-domain and cross-modality scenarios. (iii) Finally, I present a random forest-based feature fusion method for multi-view action recognition.

Item Type: Thesis (PhD)
Keywords: computer vision, visual feature, submodularity, action recognition, object recognition, transfer learning, dictionary learning, random forest
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Electronic and Electrical Engineering (Sheffield)
Identification Number/EthosID: uk.bl.ethos.638980
Depositing User: Mr Fan Zhu
Date Deposited: 03 Mar 2015 08:47
Last Modified: 03 Oct 2016 12:09
URI: http://etheses.whiterose.ac.uk/id/eprint/8218

