Document-Level Machine Translation Quality Estimation

Abstract

Assessing Machine Translation (MT) quality at document level is a challenge as metrics need to account for many linguistic phenomena on different levels. Large units of text encompass different linguistic phenomena and, as a consequence, a machine translated document can have different problems. It is hard for humans to evaluate documents regarding document-wide phenomena (e.g. coherence) as they get easily distracted by problems at other levels (e.g. grammar). Although standard automatic evaluation metrics (e.g. BLEU) are often used for this purpose, they focus on n-grams matches and often disregard document-wide information. Therefore, although such metrics are useful to compare different MT systems, they may not reflect nuances of quality in individual documents.
Machine translated documents can also be evaluated according to the task they will be used for. Methods based on measuring the distance between machine translations and post-edited machine translations are widely used for task-based purposes. Another task-based method is to use reading comprehension questions about the machine translated document, as a proxy of the document quality. Quality Estimation (QE) is an evaluation approach that attempts to predict MT outputs quality, using trained Machine Learning (ML) models. This method is robust because it can consider any type of quality assessment for building the QE models. Thus far, for document-level QE, BLEU-style metrics were used as quality labels, leading to unreliable predictions, as document information is neglected. Challenges of document-level QE encompass the choice of adequate labels for the task, the use of appropriate features for the task and the study of appropriate ML models.
In this thesis we focus on feature engineering, the design of quality labels and the use of ML methods for document-level QE. Our new features can be classified as document-wide (use shallow document information), discourse-aware (use information about discourse structures) and consensus-based (use other machine translations as pseudo-references). New labels are proposed in order to overcome the lack of reliable labels for document-level QE. Two different approaches are proposed: one aimed at MT for assimilation with a low requirement, and another aimed at MT for dissemination with a high quality requirement. The assimilation labels use reading comprehension questions as a proxy of document quality. The dissemination approach uses a two-stage post-editing method to derive the quality labels. Different ML techniques are also explored for the document-level QE task, including the appropriate use of regression or classification and the study of kernel combination to deal with features of different nature (e.g. handcrafted features versus consensus features). We show that, in general, QE models predicting our new labels and using our discourse-aware features are more successful than models predicting automatic evaluation metrics. Regarding ML techniques, no conclusions could be drawn, given that different models performed similarly throughout the different experiments.

Metadata

Supervisors:	Specia, Lucia
Awarding institution:	University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield) The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Identification Number/EthosID:	uk.bl.ethos.707099
Depositing User:	Miss Carolina Scarton
Date Deposited:	30 Mar 2017 13:40
Last Modified:	12 Oct 2018 09:36
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:16763

Download

thesis_Carolina_Scarton_final

Filename: thesis_Carolina_Scarton_final.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License

CLICK TO DOWNLOAD

[thumbnail of thesis_Carolina_Scarton_final.pdf]

You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.

Document-Level Machine Translation Quality Estimation

Abstract

Metadata

Download

thesis_Carolina_Scarton_final

Export

Statistics