Domain adaptation for pedestrian detection

Abstract

Object detection is an essential component of many computer vision systems. The increase in the amount of collected digital data and new applications of computer vision have generated a demand for object detectors for many different types of scenes digitally captured in diverse settings. The appearance of objects captured across these different scenarios can vary significantly, causing readily available state-of-the-art object detectors to perform poorly in many of the scenes. One solution is to annotate and collect labelled data for each new scene and train a scene-specific object detector that is specialised to perform well for that scene, but such a method is labour intensive and impractical.
In this thesis, we propose three novel contributions to learn scene-specific pedestrian detectors for scenes with minimal human supervision effort. In the first and second contributions, we formulate the problem as unsupervised domain adaptation in which a readily available generic pedestrian detector is automatically adapted to specific scenes (without any labelled data from the scenes). In the third contribution, we formulate it as a weakly supervised learning algorithm requiring annotations of only pedestrian centres.
The first contribution is a detector adaptation algorithm using joint dataset feature learning. We use state-of-the-art deep learning for the purpose of detector adaptation by exploiting the assumption that the data lies on a low dimensional manifold. The algorithm significantly outperforms a state-of-the-art approach that makes use of a similar manifold assumption.
The second contribution presents an efficient detector adaptation algorithm that makes effective use of cues (e.g spatio-temporal constraints) available in video. We show that, for videos, such cues can dramatically help with the detector adaptation. We extensively compare our approach with state-of-the-art algorithms and show that our algorithm outperforms the competing approaches despite being simpler to implement and apply.
In the third contribution, we approach the task of reducing manual annotation effort by formulating the problem as a weakly supervised learning algorithm that requires annotation of only approximate centres of pedestrians (instead of the usual precise bounding boxes). Instead of assuming the availability of a generic detector and adapting it to new scenes as in the first two contributions, we collect manual annotation for new scenes but make the annotation task easier and faster. Our algorithm reduces the amount of manual annotation effort by approximately four times while maintaining a similar detection performance as the standard training methods. We evaluate each of the proposed algorithms on two challenging publicly available video datasets.

Metadata

Supervisors:	Hogg, David
Keywords:	Domain adaptation, transfer learning, domain adaptation for videos, pedestrian detector adaptation, pedestrian detection
Awarding institution:	University of Leeds
Academic Units:	The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds)
Identification Number/EthosID:	uk.bl.ethos.634272
Depositing User:	Kyaw Kyaw Htike
Date Deposited:	10 Feb 2015 10:29
Last Modified:	18 Feb 2020 12:47
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:7290

Download

Final eThesis - complete (pdf)

Filename: thesis_final.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

CLICK TO DOWNLOAD

You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.

Domain adaptation for pedestrian detection

Abstract

Metadata

Download

Final eThesis - complete (pdf)

Export

Statistics