BENJAMIN, TSUI ORCID: 0000-0001-9863-5845 (2023) HRTF Generation for Data Demanding Machine Learning Algorithms. PhD thesis, University of York.
Abstract
This thesis investigates the application of Machine Learning (ML) techniques to binaural audio research. Whilst there is plenty of work in this domain, much of it is limited by the amount of available Head Related Transfer Function (HRTF) data required to train modern neural network-based ML models, resulting in researchers using a less data-driven approach or finding some workaround with the limited data. This thesis focuses on generating enough data to unleash the power of a wide variety of modern ML algorithms. A novel method is presented that can simulate an unlimited number of realistic HRTFs using heads generated from Three-dimensional Morphable Models (3DMMs). This has led to the creation of the HUman Morphable Model-based Numerically Generated Binaural Impulse Response Database (HUMMNGBIRD), created with the first 5000 HRTF sets generated by this method. Principal Component Analysis (PCA) and Variational Auto-Encoder (VAE) reconstruction models were created to investigate the potential of such a large amount of data. The results provide valuable insights into the research directions that could make good use of these types of artificially generated databases in the near future.
Metadata
| Supervisors: | Gavin Kearney and William Smith |
|---|---|
| Keywords: | Machine Learning, Binaural Audio, Audio, Database, Head Related Transfer Function |
| Awarding institution: | University of York |
| Academic Units: | The University of York > School of Physics, Engineering and Technology (York) |
| Academic unit: | Electronic Engineering |
| Depositing User: | Mr Benjamin Tsui |
| Date Deposited: | 07 Sep 2023 15:02 |
| Last Modified: | 21 Mar 2024 16:13 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:33460 |
Download
Examined Thesis (PDF)
Embargoed until: 7 September 2024
Filename: Tsui_203030689_FinalTitleUpdated.pdf