Sosa Martinez, Jose Angel ORCID: https://orcid.org/0000-0002-7839-8368 (2023) Self-supervised pose estimation. PhD thesis, University of Leeds.
Abstract
Human and non-human pose estimation has been studied within the computer vision community for many decades. Progress in this area has enabled its application to many tasks, including human activity recognition, animal tracking, video surveillance, autonomous driving, and behaviour analysis. Despite the tremendous advancement in developing methods and creating datasets for pose estimation, there remains a lack of tools that work with minimal assumptions about data availability. In other words, most state-of-the-art approaches for pose estimation rely heavily on large datasets containing 2D or 3D annotations used during the training phase. This reliance can make them difficult to adapt to other domains, particularly the animal domain, where 2D and 3D annotations are scarce.
Throughout the chapters of this thesis, we explore developing and adapting self-supervised deep learning methods for both 2D and 3D pose estimation. Our focus is on creating methods that require minimal or no annotated data for training. This approach makes the resulting methods flexible, allowing them to work with diverse skeletal structures with little to no adaptation effort. We start working in this direction by adapting a 2D human pose estimation model to the animal domain. To achieve this, we incorporate a prior of synthetically generated 2D poses, allowing self-supervised training and eliminating the need for manual annotations of input images. We apply this method to explore unlabelled data, as demonstrated by our successful implementation using a dataset of recordings featuring genetically modified mice. Similarly, our proposal in the human domain involves developing a self-supervised method for estimating 3D poses directly from images. Unlike previous works dealing with the same task, our approach requires no 3D annotations for training. Our method builds upon ideas from recent human pose estimation literature and adopts elements from our mouse pose estimator. As a result, the formulation requires only unlabelled images and an unpaired prior of 2D poses for training. We further experiment with adapting this method to different conditions and body structures. Ultimately, we demonstrate that it also works well for a different skeletal structure and when utilising a prior of 2D poses generated through synthetic data rather than relying on annotations from existing datasets.
Metadata
Supervisors: | Hogg, David |
---|---|
Keywords: | pose estimation, self-supervised, unlabelled data |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Depositing User: | Mr Jose Angel Sosa Martinez |
Date Deposited: | 06 Dec 2023 14:48 |
Last Modified: | 06 Dec 2023 14:48 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:33918 |
Download
Final eThesis - complete (pdf)
Filename: thesis_corrections_done.pdf
Description: Thesis
Licence: This work is licensed under a Creative Commons Attribution NonCommercial ShareAlike 4.0 International License