Alhinti, Lubna A. ORCID: https://orcid.org/0000-0002-3636-5250 (2021) Dysarthric speech emotion classification. PhD thesis, University of Sheffield.
Abstract
Emotions play a critical role in our lives. Communicating emotion is essential to building and maintaining relationships, and misunderstanding emotions or being unable to express them clearly can lead to problems in communication. People communicate their emotional state not just through the words they use, but also through how they say them: changes in speech rate, energy, and pitch all help to convey emotional states such as 'angry', 'sad', and 'happy'.
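As a minimal illustration of the kind of prosodic cues the abstract refers to, the sketch below extracts mean pitch, mean energy, and a rough speech-rate proxy from an audio file. It assumes the librosa library and a hypothetical file path ("speech.wav"); it is not the feature pipeline used in the thesis.

```python
# Hedged sketch: extract simple prosodic correlates of emotion
# (pitch, energy, rough speech rate) from a WAV file.
# Assumes librosa is installed; "speech.wav" is a hypothetical path.
import numpy as np
import librosa

y, sr = librosa.load("speech.wav", sr=16000)

# Fundamental frequency (pitch) via the pYIN tracker; unvoiced frames are NaN.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)
mean_pitch = np.nanmean(f0)

# Short-time energy via RMS.
mean_energy = float(np.mean(librosa.feature.rms(y=y)[0]))

# Crude speech-rate proxy: acoustic onset events per second.
onsets = librosa.onset.onset_detect(y=y, sr=sr)
rate = len(onsets) / (len(y) / sr)

print(f"pitch={mean_pitch:.1f} Hz, energy={mean_energy:.4f}, rate={rate:.2f} events/s")
```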
People with dysarthria, the most common speech disorder, have reduced articulatory and phonatory control, which can affect the intelligibility of their speech. However, reduced intelligibility may not be the only problem affecting their communication: having dysarthria may also make it hard to convey emotions in speech in a way that listeners can perceive and understand clearly. Recent research shows some promise in automatically recognising the verbal content of dysarthric speech, but we know very little about the ability of people with dysarthria to convey their emotional state through nonverbal cues. This thesis investigates the ability of people with dysarthria, caused by cerebral palsy and Parkinson's disease, to communicate emotions in their speech, and the feasibility of automatically recognising these emotions. Recognising emotions from speech is a challenging problem in itself; disordered speech may exacerbate it further, as the speakers often have less control of the signifying features.
A survey was designed and distributed to gain a better understanding of different aspects of emotion communication by people with dysarthria. A parallel multimodal database of dysarthric and typical emotional speech, the first of its kind, was collected and will be made publicly available. The ability of people with dysarthria to make systematic changes to their speech to convey their emotional state is investigated by analysing a set of potential acoustic features, which are then compared to those of typical speakers. Their ability is also assessed perceptually, and human listening performance on the collected database is reported. Two main approaches to automatically classifying emotions in dysarthric speech are investigated: using models trained on dysarthric speech (speaker-dependent, matched) and on typical speech (speaker-independent, unmatched). The results of these investigations show that it is possible to automatically recognise the emotional state of a speaker with dysarthria with a high degree of accuracy for some speakers.
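As a hedged sketch of what the matched (speaker-dependent) setup could look like, the fragment below cross-validates a standard SVM classifier on per-utterance feature vectors from a single speaker. The data, feature dimensionality, and label set are placeholders for illustration only; the thesis's actual models and features may differ.

```python
# Hedged sketch of a speaker-dependent ("matched") emotion classifier:
# train and evaluate on utterances from the same speaker with dysarthria.
# X and y are synthetic placeholders; real work would use per-utterance
# acoustic feature vectors (e.g. pitch/energy/rate statistics) and labels.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 24))                      # 80 utterances x 24 features
y = rng.choice(["angry", "happy", "neutral", "sad"], size=80)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)          # within-speaker cross-validation
print(f"mean accuracy: {scores.mean():.2f}")
```

The unmatched condition would instead fit the same kind of model on typical speakers' utterances and test it on the dysarthric speaker's, which is why the two setups probe different questions about transferability.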
The work in this thesis shows that despite having more limited articulatory and prosodic control, some speakers with dysarthria can make systematic changes in their speech that help communicate their emotions. These changes are successfully perceived by human listeners as well as by automatic emotion recognition models. These findings demonstrate the potential for improved, more expressive voice-input communication aids.
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.