Alshammeri, Menwa Hayef K ORCID: https://orcid.org/0000-0002-4645-3991 (2022) Deep Learning and Distributional Semantics for the Qur'an. PhD thesis, University of Leeds.
Abstract
This research presents an empirical framework for examining the semantic similarity task in the Qur’an, aiming to promote the acquisition of knowledge from the sacred text. The framework employs recent breakthroughs in feature embedding to encode the verses of the Qur’an as embeddings, and then applies a semantic similarity metric to score the relationship between Qur’anic verses. It uses deep learning models based on distributional semantics to achieve state-of-the-art results on the semantic textual similarity task.
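As a rough illustration of the scoring step, the sketch below computes a similarity score between pairs of verse embeddings. Cosine similarity is an assumption here (the abstract does not name the metric), and the three-dimensional vectors are made-up placeholders rather than output from any model in the thesis.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings; real verse embeddings have hundreds of dimensions.
verse_a = [0.2, 0.7, 0.1]
verse_b = [0.25, 0.65, 0.05]
verse_c = [0.9, 0.0, 0.4]

score_ab = cosine_similarity(verse_a, verse_b)  # vectors point in nearly the same direction
score_ac = cosine_similarity(verse_a, verse_c)  # vectors point in different directions
```

With these toy values, `score_ab` comes out higher than `score_ac`, which is the behaviour a similarity metric over embeddings is expected to show for related versus unrelated verses.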
Embeddings can be derived from topic models such as LDA, which encode each text as a vector over topics, or learned by neural networks such as Word2vec, Doc2vec, and BERT. This research investigates a range of machine learning approaches to modelling the semantics of Qur’an verses: Qur’an topic modelling and verse clustering using Latent Dirichlet Allocation; learning Qur’an word meanings using word2vec; learning distributed representations of Qur’an verse meanings using doc2vec; detecting semantic similarity between verses using doc2vec; classifying Qur’an verses using doc2vec; and deep learning of Qur’an verse meaning similarity using BERT and a Siamese Transformer architecture.
The most successful novel contribution is the Siamese Transformer model of Qur’an verse similarity. The architecture exploits both pre-trained contextualized representations for the Arabic language and the Siamese architecture to derive semantically meaningful verse embeddings for pairwise semantic similarity detection in the Qur’an. It achieves an F1 score of 95% on the Qur’anic semantic similarity test.
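For readers unfamiliar with the metric, the following sketch shows how an F1 score could be computed for pairwise similarity detection. The scores, gold labels, and the 0.6 decision threshold are invented for illustration and are not taken from the thesis; the evaluation protocol actually used may differ.

```python
def f1_score(gold, predicted):
    """F1 for binary labels (1 = similar pair, 0 = not similar)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical model scores for six verse pairs and their gold labels.
scores = [0.91, 0.34, 0.78, 0.65, 0.12, 0.48]
gold = [1, 0, 1, 0, 0, 1]

THRESHOLD = 0.6  # assumed cut-off for calling a pair "similar"
predicted = [1 if s >= THRESHOLD else 0 for s in scores]

f1 = f1_score(gold, predicted)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is 2/3.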
The performance results obtained in the experiments are significant contributions of this research. The document vector approach proved more useful for retrieving the verses semantically closest to a given verse. Classifiers and neural networks were trained on top of the derived vectors for classification, regression, and semantic similarity, yielding performance comparable to or better than previously reported results.
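The retrieval use of document vectors mentioned above can be sketched as a nearest-neighbour lookup. This is a minimal sketch under stated assumptions: the verse identifiers (`v1` to `v4`) and their vectors are hypothetical, and ranking by dot product of unit-normalized vectors is one common choice, not necessarily the procedure used in the thesis.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def most_similar(query_id, vectors, k=2):
    """Return the k verses closest to query_id, ranked by cosine similarity."""
    unit = {vid: normalize(vec) for vid, vec in vectors.items()}
    query = unit[query_id]
    scored = [
        (vid, sum(a * b for a, b in zip(query, vec)))
        for vid, vec in unit.items()
        if vid != query_id  # do not return the query verse itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical document vectors keyed by made-up verse identifiers.
verse_vectors = {
    "v1": [0.9, 0.1, 0.0],
    "v2": [0.8, 0.2, 0.1],
    "v3": [0.0, 0.9, 0.4],
    "v4": [0.1, 0.1, 0.9],
}

neighbours = most_similar("v1", verse_vectors, k=2)
```

Normalizing once and ranking by dot product keeps the lookup simple; for a full corpus, an approximate nearest-neighbour index would be a natural replacement for the linear scan.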
Metadata
Supervisors: | Atwell, Eric and Alsalka, Mhd Ammar |
---|---|
Keywords: | The Qur'an, Qur'anic Semantic Similarity, Semantic Similarity, ANLP, NLP, Deep Learning, Distributional Semantics, Verse embeddings |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Depositing User: | Mrs Menwa Alshammeri |
Date Deposited: | 28 Mar 2023 09:21 |
Last Modified: | 01 Mar 2024 01:06 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:32374 |
Download
Final eThesis - complete (pdf)
Filename: Deep Learning and distributional Semantics for the Quran_PhD Thesis_Menwa alshammeri201278960.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License