White Rose University Consortium logo
University of Leeds logo University of Sheffield logo York University logo

Hierarchical Modelling and Recognition of Activities of Daily Living

Tayyub, Jawad (2018) Hierarchical Modelling and Recognition of Activities of Daily Living. PhD thesis, University of Leeds.

Tayyub_J_Computing_PhD_2018.pdf - Final eThesis - complete (pdf)
Available under License Creative Commons Attribution-Noncommercial-Share Alike 2.0 UK: England & Wales.

Download (36Mb) | Preview


Activity recognition is becoming an increasingly important task in artificial intelligence. Successful activity recognition systems must be able to model and recognise activities ranging from simple short activities spanning a few seconds to complex longer activities spanning minutes or hours. We define activities as a set of qualitatively interesting interactions between people, objects and the environment. Accurate activity recognition is a desirable task in many scenarios such as surveillance, smart environments, robotic vision etc. In the domain of robotic vision specifically, there is now an increasing interest in autonomous robots that are able to operate without human intervention for long periods of time. The goal of this research is to build activity recognition approaches for such systems that are able to model and recognise simple short activities as well as complex longer activities arising from long-term autonomous operation of intelligent systems. The research makes the following key contributions: 1. We present a qualitative and quantitative representation to model simple activities as observed by autonomous systems. 2. We present a hierarchical framework to efficiently model complex activities that comprise of many sub-activities at varying levels of granularity. Simple activities are modelled using a discriminative model where a combined feature space, consisting of qualitative and quantitative spatio-temporal features, is generated in order to encode various aspects of the activity. Qualitative features are computed using qualitative spatio-temporal relations between human subjects and objects in order to abstractly represent the simple activity. Unlike current state-of-the-art approaches, our approach uses significantly fewer assumptions and does not require any knowledge about object types, their affordances, or the constituent activities of an activity. The optimal and most discriminating features are then extracted, using an entropy-based feature selection process, to best represent the training data. A novel approach for building models of complex long-term activities is presented as well. The proposed approach builds a hierarchical activity model from mark-up of activities acquired from multiple annotators in a video corpus. Multiple human annotators identify activities at different levels of conceptual granularity. Our method automatically infers a ‘part-of’ hierarchical activity model from this data using semantic similarity of textual annotations and temporal consistency. We then consolidate hierarchical structures learned from different training videos into a generalised hierarchical model represented as an extended grammar describing the over all activity. We then describe an inference mechanism to interpret new instances of activities. Simple short activity classes are first recognised using our previously learned generalised model. Given a test video, simple activities are detected as a stream of temporally complex low-level actions. We then use the learned extended grammar to infer the higher-level activities as a hierarchy over the low-level action input stream. We make use of three publicly available datasets to validate our two approaches of modelling simple to complex activities. These datasets have been annotated by multiple annotators through crowd-sourcing and in-house annotations. They consist of daily activity videos such as ‘cleaning microwave’, ‘having lunch in a restaurant’, ‘working in an office’ etc. The activities in these datasets have all been marked up at multiple levels of abstraction by multiple annotators, however no information on the ‘part-of’ relationship between activities is provided. The complexity of the videos and their annotations allows us to demonstrate the effectiveness of the proposed methods.

Item Type: Thesis (PhD)
Keywords: Activity Recognition, Daily Living Activities, Machine Learning, Artificial Intelligence, Graphical Models, Probabilistic Modelling, Probabilistic Grammar Learning, Inference, Complex and Long Activity Recognition
Academic Units: The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds)
Identification Number/EthosID: uk.bl.ethos.759793
Depositing User: Jawad Tayyub
Date Deposited: 27 Nov 2018 13:04
Last Modified: 18 Feb 2020 12:32
URI: http://etheses.whiterose.ac.uk/id/eprint/22186

You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.

Actions (repository staff only: login required)