Rowan, William James ORCID: 0000-0002-0473-5739
(2024)
Towards an intelligent agent for the human face: Improving the accuracy, controllability, and explainability of 3D face reconstruction.
PhD thesis, University of York.
Abstract
The human face is a critical cue for human interaction, playing an essential role in recognition, communication, and even medical diagnosis. However, 3D face reconstruction from a 2D image is an ill-posed problem, with existing approaches struggling due to natural variation in facial appearance and limited 3D data.
This thesis addresses these limitations by proposing the Intelligent Face Agent (IFA), which envisions a new form of computational interaction with human faces. The IFA accepts multi-modal inputs and offers intuitive, text-driven manipulation and explanation of 3D facial reconstructions. We use this concept to motivate the design and implementation of complementary components in 3D face generation and analysis.
First, we introduce the SynthFace Generator, a novel approach for fast, large-scale 2D-3D paired dataset generation. This method eliminates the need for manual asset creation, producing photorealistic face images with paired 3D shapes. We use this method to create SynthFace, the largest paired 2D-3D dataset for human face shape.
Second, we present Text2Face, the first method to enable the direct and complete initialisation of 3D face models from textual descriptions. This enhances the controllability of 3D face reconstruction, offering the opportunity to improve practical applications such as avatar creation.
Finally, we introduce new baselines for evaluating 3D face reconstruction methods. We propose OptiFaces, a novel baseline that assesses the performance achievable by accurately classifying a set of well-distributed reference faces, providing a more meaningful interpretation of reconstruction error. Additionally, we introduce "N Heads Are Better Than One", a new approach for evaluating combinations of existing 3D face reconstruction methods, resulting in a range of robust new baselines for 3D face reconstruction.
The research presented improves the accuracy, controllability, and explainability of 3D face reconstruction, paving the way for broader adoption and application in fields such as healthcare, security, and the creative industries.
Metadata
| Field | Value |
|---|---|
| Supervisors | Huber, Patrik and Pears, Nick and Keeling, Andrew |
| Keywords | 3D Morphable Face Models, 3D Face Reconstruction, Multi-modal Representation Learning |
| Awarding institution | University of York |
| Academic Units | The University of York > Computer Science (York) |
| Depositing User | Mr Will Rowan |
| Date Deposited | 05 Jun 2025 07:54 |
| Last Modified | 05 Jun 2025 07:54 |
| Open Archives Initiative ID (OAI ID) | oai:etheses.whiterose.ac.uk:36790 |
Download
Examined Thesis (PDF)
Filename: WRowan_PhDThesisFinal.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial 4.0 International License