Voice Synthesis using the Three-Dimensional Digital Waveguide Mesh

Abstract

The acoustic response of the vocal tract is fundamental to our interpretation of voice production. As an acoustic filter, it shapes the spectral envelope of vocal fold vibration towards resonant modes, or formants, whose behaviours form the most basic building blocks of phonetics.
Physical models of the voice exploit this effect by modelling wave propagation in abstracted cylindrical constructs. Whilst effective, the accuracy of such approaches is constrained by their simplified geometrical analogue. Developments in numerical acoustics modelling have meanwhile seen the formalisation of higher dimensionality configurations of the same technologies, allowing a much closer geometrical representation of an acoustic field. The major focus of this thesis is the application of such a technique to the vocal tract, and the comparison of its performance with lower dimensionality approaches.
To afford the development of such models, a body of data is collected from Magnetic Resonance Imaging for a range of subjects, and procedures are developed for the decomposition of this imaging into suitable, efficient data structures for simulation. The simulation technique is exhaustively validated using a combination of bespoke measurement/inversion techniques and analytical determination of lower frequency behaviours.
Finally, voice synthesis based on each numerical model is compared with acoustic recordings of the subjects involved and with equivalent simulations from lower dimensionality methods. It is found that application of a higher dimensionality method typically yields a more accurate frequency-domain representation of the voice, although in some cases lower dimensionality equivalents are seen to perform better at low frequencies.
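As a rough illustration of the kind of higher dimensionality scheme the abstract refers to, the sketch below implements the standard Kirchhoff-variable update for a rectilinear 3D digital waveguide mesh with equal impedances at every junction. It is not taken from the thesis: the mesh topology, the zero-pressure boundary, the grid size, and the function names mesh_sample_rate and step are all assumptions made for illustration, and the abstract does not state which topology or boundary formulation the author used.

```python
import math


def mesh_sample_rate(spacing_m, c=343.0):
    """Sample rate implied by a rectilinear 3D mesh: fs = c * sqrt(3) / d.

    spacing_m is the inter-node distance in metres; c is an assumed
    speed of sound in m/s (hypothetical default).
    """
    return c * math.sqrt(3.0) / spacing_m


def step(p_now, p_prev):
    """Advance the mesh by one time step.

    Each interior junction takes one third of the sum of its six
    neighbours at the current step, minus its own pressure at the
    previous step. Outer nodes are held at zero pressure, which is a
    simplifying assumption rather than a realistic tract wall.
    """
    nx, ny, nz = len(p_now), len(p_now[0]), len(p_now[0][0])
    p_next = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for x in range(1, nx - 1):
        for y in range(1, ny - 1):
            for z in range(1, nz - 1):
                neighbours = (
                    p_now[x + 1][y][z] + p_now[x - 1][y][z]
                    + p_now[x][y + 1][z] + p_now[x][y - 1][z]
                    + p_now[x][y][z + 1] + p_now[x][y][z - 1]
                )
                p_next[x][y][z] = neighbours / 3.0 - p_prev[x][y][z]
    return p_next


def empty_grid(n):
    """Cubic grid of n x n x n junction pressures, all zero."""
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]


# Excite the centre of a small mesh with a unit impulse and take one step.
p_prev = empty_grid(5)
p_now = empty_grid(5)
p_now[2][2][2] = 1.0
p_next = step(p_now, p_prev)
```

In a vocal tract application, the grid would instead be populated from the segmented imaging data, with junctions outside the airway excluded and a boundary model applied at the tract walls; those steps are outside the scope of this sketch.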

Metadata

Supervisors:	Howard, David and Murphy, Damian
Keywords:	Voice Synthesis, Digital Waveguide, Vocal Tract, Acoustics Simulation
Awarding institution:	University of York
Academic Units:	The University of York > School of Physics, Engineering and Technology (York)
Academic unit:	Department of Electronics
Identification Number/EthosID:	uk.bl.ethos.557241
Depositing User:	Mr Matthew DA Speed
Date Deposited:	04 Oct 2012 13:40
Last Modified:	21 Mar 2024 14:25
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:2800

Download

Speed_Thesis_Corrected

Filename: Speed_Thesis_Corrected.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License
