Alrashid, Tarfah (2024) Automatic Detection of Visually Descriptive Language and Scene Boundaries in Narrative Text. PhD thesis, University of Sheffield.
Abstract
The last ten years have seen an explosion of research interest in the cross-over area between computer vision and natural language processing, driven by potential applications such as automatic image description and captioning, image generation from text and automatic text illustration. Research integrating vision and language typically tries to solve image-based tasks by relying on images aligned with texts. However, such aligned data is both noisy and limited in volume. The research reported here starts from the observation that there is a vast amount of visually descriptive language in text not aligned with images that could potentially be exploited in image-text related applications. To use such language requires us to be able to identify it and to organise it, which leads to the two exploratory research questions pursued here. First, can we identify visually descriptive language in text? And second, given that human activity in the world takes place in various types of settings, can we identify such settings in narrative descriptions, and can we identify when these settings change?
We pursue these questions through two studies that address novel language processing tasks. In our first study, we focus on the task of automatically classifying text into three classes (visually descriptive, partially descriptive and not visually descriptive) based on the definition of visually descriptive language (VDL) proposed in Gaizauskas et al. (2015). We perform this task at two levels: sentence level and segment level. We use several linguistic and statistical features, both separately and in different combinations, to observe which perform better in the classification tasks. Our findings show that sentence-level classification can be performed at around 79% accuracy and segment-level classification at around 81% accuracy.
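The abstract does not spell out the feature set or classifier used. Purely as an illustrative sketch, the following shows how a sentence-level three-class classifier might be assembled from bag-of-words statistics plus simple hand-crafted visual-cue features; the cue lists, class labels and classifier choice here are hypothetical and are not the ones used in the thesis.

```python
# Illustrative sketch (not the thesis implementation): a sentence-level
# three-class classifier for visually descriptive language (VDL) combining
# bag-of-words features with simple hand-crafted linguistic features.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# Hypothetical cue lists; the thesis uses richer linguistic resources.
COLOUR_WORDS = {"red", "blue", "green", "dark", "pale", "grey"}
SIZE_WORDS = {"tall", "small", "huge", "narrow", "wide"}

class SimpleLinguisticFeatures(BaseEstimator, TransformerMixin):
    """Per-sentence counts of visual cue words plus sentence length."""
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        rows = []
        for sent in X:
            tokens = sent.lower().split()
            rows.append([
                sum(t in COLOUR_WORDS for t in tokens),
                sum(t in SIZE_WORDS for t in tokens),
                len(tokens),
            ])
        return np.array(rows, dtype=float)

# Tiny toy training set using the three classes named in the abstract.
sentences = [
    "The tall red house stood against a pale grey sky.",       # visually descriptive
    "She walked home, thinking about the dark narrow alley.",  # partially descriptive
    "He wondered whether the decision had been fair.",         # not visually descriptive
]
labels = ["VD", "PD", "NVD"]

model = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("linguistic", SimpleLinguisticFeatures()),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(sentences, labels)
print(model.predict(["A wide green field stretched to the horizon."]))
```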
In our second study, we explore how scenes in narrative text can be automatically identified. We contribute a small corpus of scene-annotated narrative text (ScANT), which, to the best of our knowledge, is the first publicly available corpus of English narrative text annotated with scene boundaries. In this study, we develop guidelines for manually annotating the dataset with scene boundary information following the SceneML framework (Gaizauskas and Alrashid, 2019). We also develop automatic scene segmentation models using both feature engineering and deep learning approaches. In the feature engineering approach we test the extent to which VDL helps in automatic scene segmentation. Our results show this is a hard task, with our best performing model, a pre-trained language model, achieving just 58% balanced accuracy on the scene segmentation task.
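Again purely as an illustration of the task setup rather than the thesis model, the sketch below casts scene boundary detection as binary classification over adjacent sentence pairs with an off-the-shelf pre-trained language model. The checkpoint, label scheme and example sentences are placeholders, and the fine-tuning on scene-annotated data that would make the scores meaningful is omitted.

```python
# Illustrative sketch (not the thesis model): scene boundary detection cast
# as binary classification over adjacent sentence pairs using a pre-trained
# language model. The classification head here is untrained, so the scores
# are placeholders until the model is fine-tuned on scene-annotated data.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = same scene, 1 = scene boundary
)

sentences = [
    "The carriage rattled through the narrow streets of the old town.",
    "Inside, the two passengers sat in silence.",
    "Three weeks later, a letter arrived at the farmhouse.",
]

# Score each adjacent pair: does a new scene start at the second sentence?
for left, right in zip(sentences, sentences[1:]):
    inputs = tokenizer(left, right, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    boundary_prob = torch.softmax(logits, dim=-1)[0, 1].item()
    print(f"P(boundary) = {boundary_prob:.2f} | ... {right[:40]}")
```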
References
Robert Gaizauskas, Josiah Wang, and Arnau Ramisa. Defining visually descriptive language. In Proceedings of the Fourth Workshop on Vision and Language, pages 10–17, 2015.
Robert Gaizauskas and Tarfah Alrashid. SceneML: A proposal for annotating scenes in narrative text. In Proceedings of the Fifteenth Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-15), pages 13–21, 2019.
Metadata
Supervisors: Gaizauskas, Robert
Awarding institution: University of Sheffield
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield); The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Depositing User: Miss Tarfah Alrashid
Date Deposited: 11 Sep 2024 10:35
Last Modified: 11 Sep 2024 10:35
Open Archives Initiative ID (OAI ID): oai:etheses.whiterose.ac.uk:35471
Download
Final eThesis - complete (pdf)
Embargoed until: 11 September 2025
Filename: Tarfah_PhD_Thesis_Corrected_Revised.pdf
Please use the 'Request a copy' link in the 'Download' section above to request this thesis. The request will be sent directly to someone who may authorise access.