Ismail, Azniah Binti (2012) Minimally Supervised Techniques for Bilingual Lexicon Extraction. PhD thesis, University of York.
Abstract
Normally, word translations are extracted from non-parallel, bilingual corpora, and an initial bilingual lexicon, i.e., a list of known translations, is typically used to aid the learning process. This thesis presents a series of novel techniques that make use of scarce resources. To make the study more challenging, only minimal use of resources was allowed and major linguistic tools were not employed. The study thus introduces novel techniques for learning a translation lexicon based on a minimally supervised, context-based approach. The performance of each technique was measured by comparing the extracted lexicon to a reference lexicon using the F1 score, the harmonic mean of precision and recall, which ranges from 0 (worst) to 100% (best). Analysis showed that the proposed techniques achieved promising F1 scores, ranging from 57.1% to 80.9%, indicating moderate to good performance. Overall, the findings of this study further reinforce the use of techniques that exploit words from small corpora, suggesting that words that are contextually relevant and occur in a similar domain are potentially useful. This thesis also presents a technique for deploying additional data harvested from the web, and a novel method for measuring the similarity of features between two words of different languages without the use of an initial bilingual lexicon.
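The evaluation described in the abstract, comparing an extracted lexicon to a reference lexicon by F1 score, can be illustrated with a minimal sketch. It assumes that each lexicon is a set of (source word, target word) pairs and that a pair counts as correct only on an exact match; the function name `evaluate_lexicon` and the toy word pairs are hypothetical and are not taken from the thesis, whose exact scoring procedure is not given in this record.

```python
def evaluate_lexicon(extracted, reference):
    """Return (precision, recall, f1) for extracted translation pairs.

    Assumption: both arguments are iterables of (source, target) pairs,
    and a pair is correct only if it appears verbatim in the reference.
    """
    extracted = set(extracted)
    reference = set(reference)
    correct = len(extracted & reference)

    # Precision: share of extracted pairs that are in the reference.
    precision = correct / len(extracted) if extracted else 0.0
    # Recall: share of reference pairs that were extracted.
    recall = correct / len(reference) if reference else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy pairs invented for illustration only (not data from the thesis).
reference = {
    ("casa", "house"),
    ("perro", "dog"),
    ("gato", "cat"),
    ("libro", "book"),
    ("agua", "water"),
}
extracted = {
    ("casa", "house"),
    ("perro", "dog"),
    ("gato", "cat"),
    ("libro", "pound"),  # an incorrect proposal
}

precision, recall, f1 = evaluate_lexicon(extracted, reference)
print(f"precision={precision:.1%} recall={recall:.1%} F1={f1:.1%}")
# prints: precision=75.0% recall=60.0% F1=66.7%
```

Because the harmonic mean is dominated by the smaller of its two inputs, a technique cannot reach a high F1 by proposing very few, very safe translations or by proposing many unreliable ones.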
Metadata
Supervisors: | Manandhar, Suresh |
---|---|
Awarding institution: | University of York |
Academic Units: | The University of York > Computer Science (York) |
Identification Number/EthosID: | uk.bl.ethos.572381 |
Depositing User: | Ms. Azniah Binti Ismail |
Date Deposited: | 29 May 2013 13:15 |
Last Modified: | 08 Sep 2016 13:02 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:3964 |
Download
Filename: Thesis_Azniah.pdf
Licence: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License