Peng, Xutan ORCID: https://orcid.org/0000-0001-5787-9982 (2022) Embedding Multilingual and Relational Data Using Linear Mappings. PhD thesis, University of Sheffield.
Abstract
This thesis presents our research on the embedding method, a machine learning technique that encodes real-world signals into high-dimensional vectors. Specifically, we focus on a family of algorithms whose backbone is one simple yet elegant type of topological operation, the linear mapping, aka. linear transformation or vector space homomorphism. Past studies have shown the usefulness of these approaches for modelling complex data, such as lexicons from different languages and networks storing factual relations. However, they also exhibit crucial limitations, including a lack of theoretical justifications, precision drop in challenging setups, and considerable environmental impact during training, among others.
To bridge these gaps, we first identify the unnoticed link between the success of linear Cross-Lingual Word Embedding (CLWE) mappings and the preservation of the implicit analogy relation, using both theoretical and empirical evidence. Next, we propose a post-hoc L1-norm rotation step which substantially improves the performance of existing CLWE mappings. Then, beyond solving conventional questions where only modern languages are involved, we extend the application of CLWE mappings to summarising lengthy and opaque historical text. Finally, motivated by the learning procedure of CLWE models, we adopt linear mappings to optimise Knowledge Graph Embeddings (KGEs) iteratively, significantly reducing the carbon footprint required to train the algorithm.
Metadata
Download
Final eThesis - complete (pdf)
Filename: Xutan_PhD_Thesis.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License
Export
Statistics
You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.