Greenall, John Patrick (2012) High-level activity learning and recognition in structured environments. PhD thesis, University of Leeds.
Abstract
Automatic recognition of events in video is an immensely challenging problem. If solved, the number of potential domains in which such a system could be deployed is vast and growing, including traffic monitoring, surveillance, security, elderly care and semantic video search, to name but a few. Much prior research in the area has focused on producing a solution that is tailored towards one of these applications, applying methods which are most appropriate given the constraints of the target domain. For the moment, this remains to some extent the only practical way to approach the problem. The aim in this thesis is to build a high-level framework for event recognition which is in the main generic and widely transferable, yet allows domain-appropriate elements to be incorporated.
A detector for low-level events is constructed, based on dense extraction of Histograms of Optical Flow. This descriptor has only recently been adopted by the event detection community, and as such there are aspects of the features which have not been optimized. This thesis performs extensive experimentation on the normalization scheme and finds that the strategy most widely in use is suboptimal compared to one of the alternatives proposed. The detector is then trained on a challenging real-world domain to run in a sliding-window fashion on continuous video input.
A high-level model which exploits temporal relations between different event types is constructed. The model is designed with transferability and computational tractability in mind. Several methods are benchmarked for learning the distributions over time differences between pairs of events. Three different connection strategies are proposed and evaluated for creating a tree-structured prior that permits fast, exact inference. An efficient iterative optimization scheme is presented for handling scenarios which contain unknown numbers of event instances. Finally, the model is extended in a Conditional Random Field framework that allows weights to be learned to balance the response from independent detectors with the pairwise temporal relationships.
Metadata
Supervisors: | Cohn, A. and Hogg, D. |
---|---|
ISBN: | 978-0-85731-270-9 |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.564189 |
Depositing User: | Repository Administrator |
Date Deposited: | 03 Jan 2013 11:59 |
Last Modified: | 07 Mar 2014 11:24 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:3231 |
Download
thesis
Filename: thesis.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License