McKenzie, Thomas (2019) High Frequency Reproduction in Binaural Ambisonic Rendering. PhD thesis, University of York.
Abstract
Humans can localise sounds in all directions using three main auditory cues: the differences in time and level between signals arriving at the left and right eardrums (interaural time difference and interaural level difference, respectively), and the spectral characteristics of the signals due to reflections and diffractions off the body and ears. These auditory cues can be recorded for a position in space using the head-related transfer function (HRTF), and binaural synthesis at this position can then be achieved through convolution of a sound signal with the measured HRTF. However, reproducing soundfields with multiple sources, or at multiple locations, requires a highly dense set of HRTFs. Ambisonics is a spatial audio technology that decomposes a soundfield into a weighted set of directional functions, which can be utilised binaurally in order to spatialise audio at any direction using far fewer HRTFs. A limitation of low-order Ambisonic rendering is poor high frequency reproduction, which reduces the accuracy of the resulting binaural synthesis.
This thesis presents novel HRTF pre-processing techniques, such that when using the augmented HRTFs in the binaural Ambisonic rendering stage, the high frequency reproduction is a closer approximation of direct HRTF rendering. These techniques include Ambisonic Diffuse-Field Equalisation, to improve spectral reproduction over all directions; Ambisonic Directional Bias Equalisation, to further improve spectral reproduction toward a specific direction; and Ambisonic Interaural Level Difference Optimisation, to improve lateralisation and interaural level difference reproduction. Evaluation of the presented techniques compares binaural Ambisonic rendering to direct HRTF rendering numerically, using perceptually motivated spectral difference calculations, auditory cue estimations and localisation prediction models, and perceptually, using listening tests assessing similarity and plausibility. Results conclude that the individual pre-processing techniques produce modest improvements to the high frequency reproduction of binaural Ambisonic rendering, and that using multiple pre-processing techniques can produce cumulative, and statistically significant, improvements.
Metadata
Supervisors: | Kearney, Gavin and Murphy, Damian |
---|---|
Awarding institution: | University of York |
Academic Units: | The University of York > School of Physics, Engineering and Technology (York) |
Academic unit: | Electronic Engineering |
Identification Number/EthosID: | uk.bl.ethos.808717 |
Depositing User: | Mr Thomas McKenzie |
Date Deposited: | 30 Jun 2020 14:30 |
Last Modified: | 21 Mar 2024 15:41 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:26445 |
Download
Examined Thesis (PDF)
Filename: McKenzie_T__PhD_Thesis__University_of_York__2019.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License
Export
Statistics
You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.