Bibliographic Details
Title: |
Learning Symbolic Representations for Reinforcement Learning of Non-Markovian Behavior |
Authors: |
Christoffersen, Phillip J. K.; Li, Andrew C.; Icarte, Rodrigo Toro; McIlraith, Sheila A. |
Publication Year: |
2023 |
Collection: |
Computer Science |
Subject Terms: |
Computer Science - Machine Learning, Computer Science - Artificial Intelligence |
More Details: |
Many real-world reinforcement learning (RL) problems necessitate learning complex, temporally extended behavior that may only receive reward signal when the behavior is completed. If the reward-worthy behavior is known, it can be specified in terms of a non-Markovian reward function - a function that depends on aspects of the state-action history, rather than just the current state and action. Such reward functions yield sparse rewards, necessitating an inordinate number of experiences to find a policy that captures the reward-worthy pattern of behavior. Recent work has leveraged Knowledge Representation (KR) to provide a symbolic abstraction of aspects of the state that summarize reward-relevant properties of the state-action history and support learning a Markovian decomposition of the problem in terms of an automaton over the KR. Providing such a decomposition has been shown to vastly improve learning rates, especially when coupled with algorithms that exploit automaton structure. Nevertheless, such techniques rely on a priori knowledge of the KR. In this work, we explore how to automatically discover useful state abstractions that support learning automata over the state-action history. The result is an end-to-end algorithm that can learn optimal policies with significantly fewer environment samples than state-of-the-art RL on simple non-Markovian domains.
Comment: 7 pages, 2 figures, presented at KR2ML workshop at NeurIPS 2020 |
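The abstract's central idea, that a reward depending on the state-action history can be made Markovian by tracking an automaton over symbolic labels, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical toy and not the paper's algorithm: the "key then door" task, the symbols `k`, `d`, and `-`, and the names `history_reward`, `TRANSITIONS`, and `automaton_rewards` are assumptions made for illustration. The symbols and automaton are also written by hand here, whereas the paper is concerned with discovering such abstractions automatically.

```python
from typing import Dict, List, Tuple

# Hypothetical symbolic labels, one per environment step:
# "k" = key picked up, "d" = door reached, "-" = nothing reward-relevant.
Trajectory = List[str]


def history_reward(history: Trajectory) -> float:
    """Non-Markovian reward: pays 1.0 when the latest step reaches the
    door and a key was picked up at some strictly earlier step."""
    if not history or history[-1] != "d":
        return 0.0
    return 1.0 if "k" in history[:-1] else 0.0


# Hand-written automaton over the symbols (an assumption, not learned):
# (automaton_state, symbol) -> (next_state, reward).
# State 0 = no key yet, state 1 = key already obtained.
# Pairs not listed leave the state unchanged and pay 0.
TRANSITIONS: Dict[Tuple[int, str], Tuple[int, float]] = {
    (0, "k"): (1, 0.0),  # remember that the key was obtained
    (1, "d"): (1, 1.0),  # door reached with key in hand: pay reward
}


def automaton_rewards(trajectory: Trajectory) -> List[float]:
    """Replay a trajectory through the automaton. Each reward depends
    only on the current automaton state and the current symbol."""
    state = 0
    rewards: List[float] = []
    for symbol in trajectory:
        state, reward = TRANSITIONS.get((state, symbol), (state, 0.0))
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    traj = ["-", "d", "k", "-", "d"]
    by_history = [history_reward(traj[: i + 1]) for i in range(len(traj))]
    print(by_history == automaton_rewards(traj))
```

In this toy, the single automaton state carries the one fact about the past that the reward needs (whether the key was obtained), so a learner observing the pair (environment state, automaton state) faces a Markovian problem.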
Document Type: |
Working Paper |
Access URL: |
http://arxiv.org/abs/2301.02952 |
Accession Number: |
edsarx.2301.02952 |
Database: |
arXiv |