Gaussian Processes for Text Regression

Abstract

Text Regression is the task of modelling and predicting numerical indicators or response variables from textual data. It arises in a range of different problems, from sentiment and emotion analysis to text-based forecasting. Most models in the literature apply simple text representations such as bag-of-words and predict response variables in the form of point estimates. These simplifying assumptions ignore important information coming from the
data such as the underlying uncertainty present in the outputs and the linguistic structure in the textual inputs. The former is particularly important when the response variables come from human annotations while the latter can capture linguistic phenomena that go beyond simple lexical properties of a text.
In this thesis our aim is to advance the state-of-the-art in Text Regression by improving these two aspects, better uncertainty modelling in the response variables and improved text representations. Our main workhorse to achieve these goals is Gaussian Processes (GPs), a Bayesian kernelised probabilistic framework. GP-based regression models the response variables as well-calibrated probability distributions, providing additional information in
predictions which in turn can improve subsequent decision making. They also model the data using kernels, enabling richer representations based on similarity measures between texts.
To be able to reach our main goals we propose new kernels for text which aim at capturing richer linguistic information. These kernels are then parameterised and learned from the data using efficient model selection procedures that are enabled by the GP framework. Finally
we also capitalise on recent advances in the GP literature to better capture uncertainty in the response variables, such as multi-task learning and models that can incorporate non-Gaussian variables through the use of warping functions.
Our proposed architectures are benchmarked in two Text Regression applications: Emotion Analysis and Machine Translation Quality Estimation. Overall we are able to obtain better results compared to baselines while also providing uncertainty estimates for predictions in the form of posterior distributions. Furthermore we show how these models can be probed to obtain insights about the relation between the data and the response variables and also how
to apply predictive distributions in subsequent decision making procedures.

Metadata

Supervisors:	Specia, Lucia and Cohn, Trevor
Awarding institution:	University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield) The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Identification Number/EthosID:	uk.bl.ethos.714362
Depositing User:	Mr Daniel Emilio Beck
Date Deposited:	16 Jun 2017 13:30
Last Modified:	12 Oct 2018 09:40
Open Archives Initiative ID (OAI ID):	oai:etheses.whiterose.ac.uk:17619

Download

thesis

Filename: thesis.pdf

Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License

CLICK TO DOWNLOAD

You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.

Gaussian Processes for Text Regression

Abstract

Metadata

Download

thesis

Export

Statistics