Sridhar, Muralikrishna (2010) Unsupervised learning of event and object classes from video. PhD thesis, University of Leeds.
Abstract
We present a method for unsupervised learning of event classes from videos in which multiple activities may occur simultaneously. Unsupervised discovery of event classes
avoids the need for hand-crafted event classes and thereby makes it possible, in principle, to scale up to the huge number of event classes that occur in the real world. Research into an unsupervised approach has important consequences for tasks such as video understanding
and summarization, modelling usual and unusual behaviour and video indexing for retrieval. These tasks are becoming increasingly important for scenarios such as surveillance,
video search, robotic vision and sports highlights extraction as a consequence of the increasing proliferation of videos.
The proposed approach is underpinned by a generative probabilistic model for events and a graphical representation for the qualitative spatial relationships between objects and their temporal evolution. Given a set of tracks for the objects within a scene, a set of event classes is derived from the most likely decomposition of the ‘activity graph’ of spatio-temporal relationships between all pairs of objects into a set of labelled events
involving subsets of these objects.
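As a rough illustration of this representation, the sketch below builds a pairwise activity graph from object tracks, recording only the changes in a coarse qualitative spatial relation (disconnected / touching / overlapping) between each pair of objects. The track format, the relation vocabulary and all function names are illustrative assumptions rather than the thesis' exact formalism, which uses a richer set of qualitative spatio-temporal relations.

```python
# Hypothetical sketch: an "activity graph" of qualitative spatial relations
# between all pairs of tracked objects, recorded only where the relation changes.
# Track format and relation vocabulary are assumptions for illustration.
from itertools import combinations

def qualitative_relation(box_a, box_b):
    """Map a pair of bounding boxes (x1, y1, x2, y2) to a coarse spatial relation."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    overlap_w = min(ax2, bx2) - max(ax1, bx1)
    overlap_h = min(ay2, by2) - max(ay1, by1)
    if overlap_w > 0 and overlap_h > 0:
        return "overlapping"
    if overlap_w >= 0 and overlap_h >= 0:
        return "touching"
    return "disconnected"

def build_activity_graph(tracks):
    """tracks: {object_id: {frame: (x1, y1, x2, y2)}}.
    Returns, for each object pair, the sequence of (frame, relation) changes."""
    graph = {}
    for a, b in combinations(sorted(tracks), 2):
        shared = sorted(set(tracks[a]) & set(tracks[b]))
        episodes, last = [], None
        for f in shared:
            rel = qualitative_relation(tracks[a][f], tracks[b][f])
            if rel != last:                 # keep only qualitative changes
                episodes.append((f, rel))
                last = rel
        graph[(a, b)] = episodes
    return graph
```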
The posterior probability of candidate solutions favours decompositions in which events of the same class have a similar relational structure, together with three other measures of well-formedness. A Markov Chain Monte Carlo (MCMC) procedure is used to efficiently search for the MAP solution. This search moves between possible decompositions
of the activity graph into sets of unlabelled events and at each move adds a close-to-optimal labelling (for this decomposition) using spectral clustering.
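The following minimal sketch shows the general shape of such a search, under simplifying assumptions: a Metropolis-Hastings loop over candidate decompositions, with the unlabelled events of each decomposition assigned class labels by spectral clustering on a pairwise similarity matrix. The proposal move, the posterior score and the event-similarity function are user-supplied placeholders, symmetric proposals are assumed, and none of the names below are taken from the thesis.

```python
# Sketch only: the thesis' actual posterior combines relational similarity
# of same-class events with three other well-formedness measures.
import math, random
import numpy as np
from sklearn.cluster import SpectralClustering

def label_events(events, similarity, n_classes):
    """Label unlabelled events by spectral clustering on their similarity matrix."""
    S = np.array([[similarity(e, f) for f in events] for e in events])
    return SpectralClustering(n_clusters=n_classes,
                              affinity="precomputed").fit_predict(S)

def mcmc_search(initial, propose, log_posterior, n_iter=1000, seed=0):
    """Generic Metropolis-Hastings loop over decompositions (symmetric proposals)."""
    rng = random.Random(seed)
    current, best = initial, initial
    cur_lp = best_lp = log_posterior(initial)
    for _ in range(n_iter):
        candidate = propose(current, rng)
        cand_lp = log_posterior(candidate)
        if math.log(rng.random()) < cand_lp - cur_lp:   # accept/reject step
            current, cur_lp = candidate, cand_lp
            if cur_lp > best_lp:                        # track the MAP estimate
                best, best_lp = current, cur_lp
    return best
```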
Experiments on simulated and real data show that the discovered event classes are often semantically meaningful and correspond well with ground-truth event classes assigned
by hand.
Event learning is followed by the learning of functional object categories. Equivalence classes of objects are discovered on the basis of their similar functional role in multiple event instantiations. Objects are represented in a multidimensional space that captures their functional role in all the events. Unsupervised learning in this space results in functional object categories.
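One possible concrete reading of this step, sketched below: each object is embedded by the (normalised) frequency with which it fills each event-class/role slot, and objects are then clustered in that space. The use of k-means and the exact feature construction are illustrative assumptions, not necessarily the choices made in the thesis.

```python
# Illustrative sketch of functional object-category discovery.
import numpy as np
from sklearn.cluster import KMeans

def functional_categories(event_instances, object_ids, n_categories):
    """event_instances: list of (event_class, role, object_id) triples."""
    slots = sorted({(c, r) for c, r, _ in event_instances})
    index = {s: i for i, s in enumerate(slots)}
    X = np.zeros((len(object_ids), len(slots)))
    for c, r, obj in event_instances:
        X[object_ids.index(obj), index[(c, r)]] += 1
    X /= np.maximum(X.sum(axis=1, keepdims=True), 1)   # normalise per object
    return KMeans(n_clusters=n_categories, n_init=10).fit_predict(X)
```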
Experiments in the domain of aircraft handling suggest that our spatio-temporal representation, together with the learning techniques, forms a promising framework for learning functional object categories from video.
Metadata
| Supervisors | Cohn, Anthony and Hogg, David |
| --- | --- |
| ISBN | 978-0-85731-049-1 |
| Awarding institution | University of Leeds |
| Academic Units | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
| Identification Number/EthosID | uk.bl.ethos.557358 |
| Depositing User | Ethos Import |
| Date Deposited | 15 Aug 2012 15:22 |
| Last Modified | 24 Aug 2020 06:37 |
| Open Archives Initiative ID (OAI ID) | oai:etheses.whiterose.ac.uk:2621 |
Download
Final eThesis - complete (pdf)
Filename: M_Sridhar_PhDThesis_Dec2010.pdf
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.