Space-aware subword tokenisation and complex word processing in language models

Gow-Smith, Edward (2025) Space-aware subword tokenisation and complex word processing in language models. PhD thesis, University of Sheffield.

Abstract

Metadata

Supervisors: Villavicencio, Aline
Awarding institution: University of Sheffield
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield)
Date Deposited: 09 Feb 2026 14:03
Last Modified: 09 Feb 2026 14:03
Open Archives Initiative ID (OAI ID):

Download

Final eThesis - complete (pdf)


