Learning Symbolic Representations for Reinforcement Learning of Non-Markovian Behavior

Bibliographic Details
Title:	Learning Symbolic Representations for Reinforcement Learning of Non-Markovian Behavior
Authors:	Christoffersen, Phillip J. K.; Li, Andrew C.; Icarte, Rodrigo Toro; McIlraith, Sheila A.
Publication Year:	2023
Collection:	Computer Science
Subject Terms:	Computer Science - Machine Learning, Computer Science - Artificial Intelligence
More Details:	Many real-world reinforcement learning (RL) problems necessitate learning complex, temporally extended behavior that may only receive reward signal when the behavior is completed. If the reward-worthy behavior is known, it can be specified in terms of a non-Markovian reward function - a function that depends on aspects of the state-action history, rather than just the current state and action. Such reward functions yield sparse rewards, necessitating an inordinate number of experiences to find a policy that captures the reward-worthy pattern of behavior. Recent work has leveraged Knowledge Representation (KR) to provide a symbolic abstraction of aspects of the state that summarize reward-relevant properties of the state-action history and support learning a Markovian decomposition of the problem in terms of an automaton over the KR. Providing such a decomposition has been shown to vastly improve learning rates, especially when coupled with algorithms that exploit automaton structure. Nevertheless, such techniques rely on a priori knowledge of the KR. In this work, we explore how to automatically discover useful state abstractions that support learning automata over the state-action history. The result is an end-to-end algorithm that can learn optimal policies with significantly fewer environment samples than state-of-the-art RL on simple non-Markovian domains.
Comment:	7 pages, 2 figures, presented at KR2ML workshop at NeurIPS 2020
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2301.02952
Accession Number:	edsarx.2301.02952
Database:	arXiv
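
The abstract's central idea is that a reward depending on the state-action history becomes Markovian once an automaton state is tracked alongside the environment state. The small example below illustrates this. It is a minimal sketch under stated assumptions, not the authors' algorithm: the paper learns the symbolic abstraction and the automaton, whereas here both are written by hand. The class RewardMachine, the function total_reward, the constant KEY_THEN_DOOR, and the propositions "key" and "door" are hypothetical names chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# A label is the set of symbolic propositions that hold at one time step.
# ASSUMPTION: labels are given by hand here; the paper's aim is to learn them.
Label = FrozenSet[str]


class RewardMachine:
    """Finite automaton whose transitions emit rewards (illustrative only)."""

    def __init__(
        self,
        transitions: Dict[Tuple[int, str], Tuple[int, float]],
        initial: int = 0,
    ) -> None:
        self.transitions = transitions
        self.initial = initial

    def step(self, u: int, label: Label) -> Tuple[int, float]:
        # Take the first matching transition (propositions checked in sorted
        # order); with no match, stay in the same state with zero reward.
        for prop in sorted(label):
            if (u, prop) in self.transitions:
                return self.transitions[(u, prop)]
        return u, 0.0


def total_reward(rm: RewardMachine, trace: Iterable[Label]) -> float:
    """Sum the rewards the machine emits along a trace of labels."""
    u, total = rm.initial, 0.0
    for label in trace:
        u, r = rm.step(u, label)
        total += r
    return total


# Hypothetical task: reward 1 for reaching the door only after getting the key.
KEY_THEN_DOOR = RewardMachine({(0, "key"): (1, 0.0), (1, "door"): (2, 1.0)})

door, key = frozenset({"door"}), frozenset({"key"})
print(total_reward(KEY_THEN_DOOR, [door, key, door]))  # door first earns nothing
print(total_reward(KEY_THEN_DOOR, [door, door]))       # key never collected
```

Because the automaton state records whether the key has been collected, the reward at each step depends only on the current automaton state and the current label. That pairing is the kind of Markovian decomposition the abstract refers to.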
