Luna Gutierrez, Ricardo (2022) Efficient Meta-Reinforcement Learning. PhD thesis, University of Leeds.
Abstract
In Meta-Reinforcement Learning (meta-RL), agents are trained on a set of tasks to prepare for, and learn faster in, new, unseen, but related tasks. The standard practice for building training sets in meta-RL is to cover the task distribution densely, generating a very large set of training tasks.
This thesis introduces a novel framework for meta-RL, in which models have access to a limited number of tasks to train on. With this framework in mind we propose task selection methods as well as an application that can benefit from it.
We introduce ITTS, a task selection method that selects tasks that are different from one another and relevant to a set of tasks sampled from the target distribution. The output is a smaller training set which can be learnt faster and yields better performance than training with all the available tasks. We experimentally evaluate the performance of ITTS in a variety of domains and show that ITTS improves the final performance of the agents in all of them.
We build insight into the behaviours learnt by meta-RL and propose FETA, a task selection method that improves on ITTS. FETA is a simpler and more cost-efficient task selection method that filters tasks by taking advantage of policy transfer between tasks. We experimentally evaluate FETA and demonstrate that, even though its task selection process is more efficient, FETA performs as well as or better than ITTS.
Finally, we make the first connection between meta-RL and heuristic planning, showing that heuristic functions meta-learned from planning problems can outperform both popular domain-independent heuristics and heuristics learned by supervised learning.
Metadata
Supervisors: | Cohn, Anthony and Leonetti, Matteo |
---|---|
Keywords: | Meta-Reinforcement Learning, Task Selection |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.855646 |
Depositing User: | Ricardo Luna Gutierrez |
Date Deposited: | 15 Jun 2022 11:09 |
Last Modified: | 11 Jul 2022 09:53 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:30497 |
Download
Final eThesis - complete (pdf)
Filename: University_of_Leeds_Thesis.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial ShareAlike 4.0 International License