Bassich, Andrea ORCID: https://orcid.org/0000-0002-6978-2050 (2022) Curriculum Learning with a progression function. PhD thesis, University of York.
Abstract
Whenever we, as humans, need to learn a complex task, our learning is usually organised in a specific order: starting from simple concepts and progressing onto more complex ones as our knowledge increases. Likewise, Reinforcement Learning agents can benefit from structure and guidance in their learning. The field of research that studies how to design an agent's training effectively is called Curriculum Learning, and it aims to increase the agent's performance and learning speed.
This thesis introduces a new paradigm for Curriculum Learning based on progression and mapping functions. While progression functions specify the complexity of the environment at any given time, mapping functions generate environments of a specific complexity. This framework does not impose any restriction on the tasks that can be included in the curriculum, and it allows the task the agent is training on to be changed as often as after every action.
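The split between the two kinds of function can be illustrated with a minimal sketch. It assumes complexity is a number in [0, 1], uses a linear schedule as the progression function, and uses a hypothetical grid-world task whose size grows with complexity as the mapping function. The names `linear_progression`, `GridTask`, `grid_mapping`, and `curriculum` are illustrative assumptions, not definitions taken from the thesis.

```python
from dataclasses import dataclass


def linear_progression(step: int, end_step: int) -> float:
    """Progression function (assumed linear): complexity in [0, 1] at a given step."""
    if end_step <= 0:
        raise ValueError("end_step must be positive")
    return min(step / end_step, 1.0)


@dataclass(frozen=True)
class GridTask:
    """Hypothetical task description: a square grid world of a given size."""
    size: int


def grid_mapping(complexity: float, min_size: int = 3, max_size: int = 11) -> GridTask:
    """Mapping function (assumed): build a task whose difficulty matches the complexity."""
    c = max(0.0, min(complexity, 1.0))  # clamp to [0, 1]
    size = min_size + round(c * (max_size - min_size))
    return GridTask(size=size)


def curriculum(total_steps: int, end_step: int):
    """Yield one task per step; the task may change at every step."""
    for step in range(total_steps):
        yield grid_mapping(linear_progression(step, end_step))
```

Because the progression function only outputs a complexity value and the mapping function only consumes one, either can be swapped out without touching the other.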
The problem of creating a curriculum tailored to each agent is explored in the context of the framework. This is achieved through adaptive progression functions, which specify the complexity of the environment based on the agent's performance. Furthermore, a method to progress each dimension independently is defined, and the progression functions derived from our framework are evaluated against state-of-the-art Curriculum Learning methods.
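As a rough illustration of an adaptive progression function, the sketch below raises the complexity only when the agent's recent average performance clears a threshold. The rule itself, the class name `AdaptiveProgression`, and the parameters `threshold`, `step_size`, and `window` are assumptions made for this example; the thesis defines its own adaptive progression functions, which are not reproduced here.

```python
from collections import deque


class AdaptiveProgression:
    """Performance-gated progression (an assumed rule, not the thesis's method)."""

    def __init__(self, threshold: float = 0.8, step_size: float = 0.1, window: int = 5):
        self.threshold = threshold
        self.step_size = step_size
        self.recent = deque(maxlen=window)  # most recent performance scores
        self.complexity = 0.0

    def update(self, performance: float) -> float:
        """Record a performance score and return the (possibly increased) complexity."""
        self.recent.append(performance)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and sum(self.recent) / len(self.recent) >= self.threshold:
            self.complexity = min(1.0, self.complexity + self.step_size)
            self.recent.clear()  # require fresh evidence at the new complexity
        return self.complexity
```

To progress several dimensions independently under this sketch, one such object could be kept per dimension, each fed with its own performance signal.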
Finally, a novel variation of the Multi-Armed Bandit problem is defined, where a target value is observed at each round, and the arm with the closest expected value to the target is chosen. Based on this framework, we define an algorithm to automate the generation of a mapping function.
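The bandit variation described above can be sketched as follows: at each round a target is observed, and the arm whose estimated mean is closest to that target is selected. The exploration strategy here (pull every arm once, then epsilon-greedy) is a swapped-in assumption for illustration and is not the algorithm defined in the thesis; the class name `ClosestValueBandit` is likewise hypothetical.

```python
import random


class ClosestValueBandit:
    """Bandit that picks the arm whose estimated value is closest to a per-round target."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed=None):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms  # running estimate of each arm's expected value
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self, target: float) -> int:
        # Pull every arm once before trusting the estimates.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Assumed exploration rule: random arm with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.means))
        # Exploit: arm with the estimated value nearest to the observed target.
        return min(range(len(self.means)), key=lambda a: abs(self.means[a] - target))

    def update(self, arm: int, value: float) -> None:
        """Incrementally update the running mean of the pulled arm."""
        self.counts[arm] += 1
        self.means[arm] += (value - self.means[arm]) / self.counts[arm]
```

Unlike the standard problem, there is no single best arm here: which arm is optimal depends on the target observed in that round.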
The end result of this thesis is a method that is agnostic to the learning algorithm, is able to translate domain knowledge into an increase in performance (while providing similar benefits when such domain knowledge is not available), and can create a fully automated curriculum tailored to each learning agent.
Metadata
Supervisors: | Simos, Gerasimou |
---|---|
Keywords: | Curriculum Learning, Reinforcement Learning |
Awarding institution: | University of York |
Academic Units: | The University of York > Computer Science (York) |
Identification Number/EthosID: | uk.bl.ethos.855797 |
Depositing User: | Mr Andrea Bassich |
Date Deposited: | 06 Jun 2022 13:45 |
Last Modified: | 21 Jun 2022 09:53 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:30671 |
Download
Examined Thesis (PDF)
Filename: Thesis.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License