Steele, Kim (2023) Streamlining production workflows for accessible audio using AI metadata assignment. MSc by research thesis, University of York.
Abstract
Advancements in broadcast technology are granting new opportunities to improve listening experiences for consumers. One such advancement is object-based audio and the development of the accessible audio system termed Narrative Importance. This system provides users with an adjustable mix that boosts sounds which are important to the story, such as dialogue, and attenuates non-essential background sounds, such as crowd chatter. This thesis will target one of the barriers to rolling out the Narrative Importance system: increased production time due to implementation requirements. The specific focus will be furthering the investigation into whether machine learning can be trained to assign the requisite metadata for Narrative Importance.
A survey is deployed to collect label data for training the machine learning algorithm. Participants are asked to assign importance data to sounds in a mix for nine scenes. This data is then used to train a mixture model to categorise audio objects into 4 levels of importance. The results show that the method chosen here is not successful in its current form. Training with survey labels proves to be ineffective due to low levels of agreement amongst participants. Training with a single set of labels is shown to give better results.
The question of “What is object-based audio?” was also investigated, partially as a result of differing definitions in the existing literature. A survey of audio research professionals was undertaken. The results show that a robust definition for object-based audio exists neither in writing nor in practice. As a result, a definition is proposed in this thesis.
Metadata
Supervisors: | Kearney, Gavin and Ward, Lauren and Paradis, Matthew |
---|---|
Keywords: | Narrative Importance, machine learning, object-based audio, metadata, AI, artificial intelligence, audio, accessible audio, hearing loss, accessibility, personalisation |
Awarding institution: | University of York |
Academic Units: | The University of York > School of Physics, Engineering and Technology (York) |
Academic unit: | Electronic Engineering |
Depositing User: | Miss Kim Steele |
Date Deposited: | 07 Sep 2023 14:46 |
Last Modified: | 21 Mar 2024 16:13 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:33419 |
Download
Examined Thesis (PDF)
Embargoed until: 7 March 2025
Filename: Steele_208065713_CorrectedThesisClean.pdf