Prosody resources and symbolic prosodic features for automated phrase break prediction

Abstract

It is universally recognised that humans process speech and language in chunks, each meaningful in itself. Any two renditions or assimilations of a given sentence will exhibit similarities and discrepancies in chunking, where speakers and readers use pauses and inflections to mark phrase breaks. This thesis reviews deterministic and stochastic approaches to phrase break prediction, plus datasets, evaluation metrics and feature sets. Early rule-based experimental work with a chunk parser gives rise to motivational insights, namely: the limitations of traditional features (syntax and punctuation) and deficiency of prosody in current phrasing models, and the problem of evaluating performance when the training set only represents one phrasing variant. Such insights inform resource creation in the form of ProPOSEL, a prosody and part-of-speech English lexicon, to create a domain-independent knowledge source, plus prosodic annotation and text analytics tool for corpus-based research, supported by a comprehensive software tutorial. Future applications of ProPOSEL include prosody-motivated speech-to-viseme generation for "talking heads" and expressive avatar creation. Here, ProPOSEL is used to build the ProPOSEC dataset by merging and annotating two versions of the Spoken English Corpus. Linguistic data arrays in this dataset are first mined for prosodic boundary correlates and later re-conceptualised as training instances for supervised machine learning. This thesis contends that native English speakers use certain sound patterns (e.g. diphthongs and triphthongs) as linguistic signs for phrase breaks, having observed these same patterns at rhythmic junctures in poetry. Pre-boundary lexical items bearing these complex vowels and gold-standard boundary annotations are found to be highly correlated via the chi-squared statistic in different genres, including seventeenth century English verse, and for multiple speakers. Complex vowels and other symbolic prosodic features are then implemented in a phrasing model to evaluate efficacy for phrase break prediction. The ultimate challenge is to better understand how sound and rhythm, as components of the linguistic sign, inform psycholinguistic chunking even during silent reading.
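
As an illustration of the kind of association test the abstract mentions, the sketch below computes a Pearson chi-squared statistic for a 2x2 table that crosses "word contains a complex vowel" with "word precedes a phrase break". It is not the thesis's own code. The function name, the table layout and all counts are hypothetical, chosen only to show the arithmetic, and no continuity correction is applied.

```python
def chi_squared_2x2(table):
    """Pearson chi-squared statistic for a 2x2 contingency table.

    `table` is [[a, b], [c, d]] of observed counts. No continuity
    correction is applied (an assumption of this sketch).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# HYPOTHETICAL counts, not taken from the thesis or from ProPOSEC:
# rows    = word has a complex vowel / word has no complex vowel
# columns = followed by a phrase break / not followed by a break
counts = [[30, 70],
          [20, 180]]

# Standard critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_05_DF1 = 3.841

stat = chi_squared_2x2(counts)
significant = stat > CRITICAL_05_DF1
print(f"chi-squared = {stat:.2f}, significant at 0.05: {significant}")
```

A statistic above the critical value would suggest that the two annotations are not independent in the sample. It says nothing on its own about the direction or strength of the association.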

Metadata

Supervisors:	Atwell, E.
ISBN:	978-0-85731-125-2
Awarding institution:	University of Leeds
Academic Units:	The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds)
Identification Number/EthosID:	uk.bl.ethos.544555
Depositing User:	Repository Administrator
Date Deposited:	11 Jan 2012 12:21
Last Modified:	07 Mar 2014 11:24
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:2038

Download

thesisBrierleySeptember2011

Filename: thesisBrierleySeptember2011.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
