Al-Rajab, Moaath (2008) Hand gesture recognition for multimedia applications. PhD thesis, University of Leeds.
Abstract
Hand gestures are potentially a very natural and useful modality for human-machine interaction. Recognising them is considered one of the most complicated and interesting challenges
in computer vision, owing to the hand's articulated structure and to variations in the environment. Solving such challenges requires robust hand detection, feature description, and viewpoint-invariant classification.
This thesis introduces several steps to tackle these challenges and applies them in a hand-gesture-based application (a game) to demonstrate the proposed approach.
Techniques for new feature description, hand gesture detection, and viewpoint-invariant recognition are explored and evaluated. A standard webcam is used as the input
device. Hands are segmented using pre-trained skin colour models and tracked with the CAMShift tracker; moment invariants are used as a shape descriptor.
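A minimal sketch of this segment-and-track step, using OpenCV's histogram back-projection and CamShift. The initial hand window, the hue-only skin model, and the use of Hu moments as the moment invariants are illustrative assumptions, not the thesis's exact configuration:

```python
# Sketch: skin-colour segmentation + CAMShift tracking + moment invariants.
# Assumes a webcam at index 0 and a hand inside the initial window.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                      # standard webcam input
ok, frame = cap.read()
track_window = (200, 150, 100, 100)            # assumed initial hand region (x, y, w, h)

# Build a hue histogram from the initial region as a stand-in skin model.
x, y, w, h = track_window
roi_hsv = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
skin_hist = cv2.calcHist([roi_hsv], [0], None, [180], [0, 180])
cv2.normalize(skin_hist, skin_hist, 0, 255, cv2.NORM_MINMAX)

term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-project the skin model to get a skin-probability image.
    prob = cv2.calcBackProject([hsv], [0], skin_hist, [0, 180], 1)
    # CAMShift adapts the window's size and orientation to the moving hand.
    rot_rect, track_window = cv2.CamShift(prob, track_window, term_crit)
    # Moment invariants (Hu moments here, as one example) describe the shape.
    mask = (prob > 64).astype(np.uint8)
    hu = cv2.HuMoments(cv2.moments(mask)).flatten()
```

In the thesis the skin model comes from pre-trained colour models rather than from the first frame as above.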
A new approach utilising Zernike Velocity Moments (ZVMs, first introduced by Shutler and Nixon [1,2]) is examined on hand gestures. Results obtained using
ZVMs as a spatio-temporal descriptor are compared to an HMM with Zernike moments (ZMs). Manually isolated hand gestures are used as input to the ZVM descriptor, which generates feature vectors that are classified using a regression classifier. The performance of the ZVM is evaluated using isolated, user-independent, and user-dependent data.
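For orientation, a hedged sketch of the ZVM definition in the form given by Shutler and Nixon [1,2]; the notation below is reconstructed from their papers rather than from this abstract:

```latex
% Zernike velocity moment of order m, repetition n, velocity orders (mu, nu),
% accumulated over frames i = 2..I of the sequence (after Shutler and Nixon).
A_{mn\mu\nu} = \frac{m+1}{\pi}
  \sum_{i=2}^{I} \sum_{x,y} P_{i,xy}\, U(i,\mu,\nu)\,\bigl[V_{mn}(r,\theta)\bigr]^{*},
\qquad
U(i,\mu,\nu) = (\bar{x}_i - \bar{x}_{i-1})^{\mu}\,(\bar{y}_i - \bar{y}_{i-1})^{\nu}
```

Here P_{i,xy} is pixel (x, y) of frame i, V_{mn} is the Zernike polynomial, and (x̄_i, ȳ_i) is the shape centroid in frame i; with μ = ν = 0 the expression reduces to ordinary Zernike moments accumulated over the sequence, which is what makes ZVMs a spatio-temporal extension of ZMs.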
Isolating (segmenting) a gesture manually from a video stream is a research convenience only; real-life scenarios require an automatic hand
gesture detection mechanism. Two methods for detecting gestures are examined. Firstly, hand gesture detection is performed using a sliding window which segments sequences of frames and then evaluates them against pre-trained HMMs. Secondly, the set of class-specific HMMs is combined into a single HMM, and the Viterbi algorithm is then used to find the optimal sequence of gestures.
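A minimal sketch of the first (sliding-window) scheme, assuming per-class HMMs trained with hmmlearn on per-frame feature vectors; the window length, step size, and function names are illustrative assumptions:

```python
# Sliding-window gesture detection against pre-trained per-class HMMs.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_class_hmm(sequences, n_states=5):
    """Fit one HMM for a gesture class from a list of (T, D) feature arrays."""
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = GaussianHMM(n_components=n_states, covariance_type="diag")
    model.fit(X, lengths)
    return model

def detect_sliding(features, class_hmms, win=30, step=5):
    """Score each window of frames against every class HMM; keep the best."""
    detections = []
    for start in range(0, len(features) - win + 1, step):
        window = features[start:start + win]
        scores = {c: m.score(window) for c, m in class_hmms.items()}
        best = max(scores, key=scores.get)
        detections.append((start, best, scores[best]))
    return detections
```

The second scheme would instead concatenate the class-specific models into one large HMM and run Viterbi decoding (in hmmlearn, `model.decode(X)`) over the whole stream, recovering the optimal gesture sequence in a single pass rather than window by window.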
Finally, the thesis proposes a flexible application that allows the user to perform gestures from different viewpoints. A usable hand gesture recognition
system should be able to cope with such viewpoint variations. To solve this problem, a new approach is introduced which makes use of 3D models of hand gestures (not postures) to generate projections. A virtual arm with 3D models of real hands is created. Virtual movements of the hand are then simulated using animation
software and projected from different viewpoints. Using a multi-Gaussian HMM, the system is trained on the projected sequences. Each set of hand gesture projections is
labelled with its specific class and used to train the single multi-class HMM with gestures across different viewpoints.
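A hedged sketch of training on the pooled projections, using hmmlearn's Gaussian-mixture-emission HMM as one reading of "multi-Gaussian HMM"; the data layout and parameter values are illustrative assumptions:

```python
# Train a mixture-emission HMM on gesture sequences rendered from
# multiple viewpoints, pooled so one model covers all viewpoints.
import numpy as np
from hmmlearn.hmm import GMMHMM

def train_viewpoint_pooled_hmm(projected_seqs, n_states=8, n_mix=3):
    """projected_seqs: list of (T, D) feature arrays, one per rendered
    viewpoint projection of the same gesture class."""
    X = np.vstack(projected_seqs)
    lengths = [len(s) for s in projected_seqs]
    model = GMMHMM(n_components=n_states, n_mix=n_mix, covariance_type="diag")
    model.fit(X, lengths)   # Baum-Welch over all viewpoints jointly
    return model
```

Per the abstract, such per-class models over viewpoint-pooled projections would then be combined into the single multi-class HMM used for recognition.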
Metadata
| Field | Value |
| --- | --- |
| Supervisors | Hogg, D. and Ng, K. |
| Awarding institution | University of Leeds |
| Academic Units | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
| Identification Number/EthosID | uk.bl.ethos.502792 |
| Depositing User | Ethos Import |
| Date Deposited | 01 Mar 2010 15:01 |
| Last Modified | 06 Mar 2014 16:54 |
| Open Archives Initiative ID (OAI ID) | oai:etheses.whiterose.ac.uk:607 |