Hardy, Hardy ORCID: https://orcid.org/0000-0003-1825-1652 (2021) Guiding Abstractive Summarization using Structural Information. PhD thesis, University of Sheffield.
Abstract
Abstractive summarization takes a set of sentences from a source document and restates their salient information as a summary in the summarizer's own words. The resulting summaries may contain novel words and have grammatical structures that differ from those of the source document. In this sense, abstractive summarization is closer to how a human summarizes, yet it is also more difficult to automate because it requires a full understanding of natural language. With the advent of deep learning, however, many new summarization systems have achieved improved automatic and manual evaluation scores. One prominent deep learning model is the sequence-to-sequence (seq2seq) model with an attention mechanism. Moreover, language models pre-trained on very large sets of unlabeled data have further improved the performance of summarization systems. Despite these improvements, abstractive summarization is still adversely affected by hallucination and disfluency. Furthermore, recent work based on seq2seq models requires a large dataset, since the underlying neural network easily overfits on a small dataset, resulting in poor approximation and high-variance outputs. The problem is that these large datasets often come with only a single reference summary for each source document, even though human annotators are known to be subject to a certain degree of subjectivity when writing a summary.
We addressed the first problem with a mechanism in which the model uses a guidance signal to control which tokens are generated. A guidance signal is any additional signal fed into the model alongside the source document; a commonly used one is structural information from the source document. Recent approaches showed good results with guidance, but they trained the guiding mechanism jointly with the model. In other words, the model must be re-trained whenever a different guidance signal is used, which is costly. We propose approaches that work without re-training and are therefore more flexible with regard to the source of the guidance signal and also computationally cheaper. We performed two experiments. The first is a novel guided mechanism that extends previous work on abstractive summarization using Abstract Meaning Representation (AMR) with a neural language generation stage, which we guide using side information. Results showed that our approach improves over a strong baseline by 2 ROUGE-2 points. The second experiment is a guided key-phrase extractor for more informative summarization. This experiment showed mixed results, but we provide an analysis of the negative and positive output examples.
The second problem was addressed by our proposed manual evaluation framework, Highlight-based Reference-less Evaluation Summarization (HighRES). The framework avoids reference bias and provides an absolute rather than a ranked evaluation of the systems. To validate our approach, we employed crowd-workers to augment the eXtreme SUMmarization (XSUM) dataset, a highly abstractive summarization dataset, with highlights. We then compared two abstractive systems (Pointer Generator and T-Conv) to demonstrate our approach. Results showed that HighRES improves inter-annotator agreement in comparison to using the source document directly, while also emphasizing differences among systems that would be ignored under other evaluation approaches. Our work also produces an annotated dataset that gives more insight into how humans select salient information from the source document.
Metadata
| Supervisors: | Nikolaos, Aletras |
|---|---|
| Keywords: | natural language processing; automatic summarization; human evaluation |
| Awarding institution: | University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield); The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield) |
| Identification Number/EthosID: | uk.bl.ethos.848099 |
| Depositing User: | Mr. Hardy Hardy |
| Date Deposited: | 22 Feb 2022 19:22 |
| Last Modified: | 01 Apr 2022 09:53 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:30188 |
Download
Final eThesis - complete (pdf)
Filename: Final_Report_Revised___Hardy.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License