Zhen, Xiantong (2013) Feature Extraction and Representation for Human Action Recognition. PhD thesis, University of Sheffield.
Abstract
Human action recognition, one of the most important topics in computer vision, has been researched extensively over recent decades; however, it is still regarded as a challenging task, especially in realistic scenarios. The difficulties result mainly from large intra-class variation, background clutter, occlusions, illumination changes and noise. In this thesis, we aim to improve human action recognition through feature extraction and representation, using both holistic and local methods.
Specifically, we first propose three approaches for the holistic representation of actions. In the first approach, we explicitly extract motion and structure features from video sequences by converting the video representation problem into a 2D image representation problem. In the second and third approaches, we treat video sequences as 3D volumes and use spatio-temporal pyramid structures to extract multi-scale global features. Gabor filters and steerable filters are extended to the video domain for holistic representations, which have been demonstrated to be successful for action recognition. With regard to local representations, we first carry out a comprehensive evaluation of local methods, including the bag-of-words (BoW) model, sparse coding, match kernels and classifiers based on image-to-class (I2C) distances. Motivated by the findings of this evaluation, we propose two distinct algorithms for discriminative dimensionality reduction of local spatio-temporal descriptors. The first algorithm is based on image-to-class distances, while the second explores local Gaussians.
We have evaluated the proposed methods through extensive experiments on widely used human action datasets, including KTH, IXMAS, UCF Sports, UCF YouTube and HMDB51. The experimental results demonstrate the effectiveness of our methods for action recognition.
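The abstract refers to classifiers based on image-to-class (I2C) distances without giving a formulation. As an illustration only, the sketch below uses one common nearest-neighbour definition: the I2C distance from a video to a class is the sum, over the video's local descriptors, of the squared Euclidean distance to the nearest descriptor pooled from that class. This definition, the function names `i2c_distance` and `classify_by_i2c`, and the toy 2D descriptors are all assumptions made for the example; they are not taken from the thesis.

```python
import numpy as np


def i2c_distance(query_descriptors, class_descriptors):
    """Assumed I2C distance: sum of squared distances from each query
    descriptor to its nearest neighbour in the class descriptor pool."""
    # Pairwise differences via broadcasting, shape (n_query, n_class, dim).
    diffs = query_descriptors[:, None, :] - class_descriptors[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Keep the nearest class descriptor for every query descriptor.
    return float(np.sum(sq_dists.min(axis=1)))


def classify_by_i2c(query_descriptors, class_pools):
    """Return the label with the smallest I2C distance, plus all distances."""
    distances = {
        label: i2c_distance(query_descriptors, pool)
        for label, pool in class_pools.items()
    }
    return min(distances, key=distances.get), distances


# Toy descriptors (hypothetical values, two dimensions for readability).
class_pools = {
    "walking": np.array([[0.0, 0.0], [1.0, 0.0]]),
    "running": np.array([[5.0, 5.0], [6.0, 5.0]]),
}
query = np.array([[0.0, 1.0], [1.0, 1.0]])

label, distances = classify_by_i2c(query, class_pools)
print(label, distances)
```

In this sketch no codebook is built and descriptors are not quantised, which is the main practical difference from a bag-of-words pipeline; the cost is a nearest-neighbour search against every class pool at test time.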
Metadata
| Field | Value |
|---|---|
| Supervisors | Shao, Ling |
| Awarding institution | University of Sheffield |
| Academic Units | The University of Sheffield > Faculty of Engineering (Sheffield) > Electronic and Electrical Engineering (Sheffield) |
| Identification Number/EthosID | uk.bl.ethos.589361 |
| Depositing User | Mr Xiantong Zhen |
| Date Deposited | 12 Feb 2014 15:50 |
| Last Modified | 03 Oct 2016 11:03 |
| Open Archives Initiative ID (OAI ID) | oai:etheses.whiterose.ac.uk:5141 |
Download
Filename: Thesis_ZhenXT_revised_final.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License