Semi-Supervised One-Shot Imitation Learning
| Field | Value |
|---|---|
| Title | Semi-Supervised One-Shot Imitation Learning |
| Authors | Wu, Philipp; Hakhamaneshi, Kourosh; Du, Yuqing; Mordatch, Igor; Rajeswaran, Aravind; Abbeel, Pieter |
| Source | Reinforcement Learning Journal 1 (2024) |
| Publication Year | 2024 |
| Collection | Computer Science |
| Subject Terms | Computer Science - Machine Learning; Computer Science - Artificial Intelligence |
| Abstract | One-shot Imitation Learning (OSIL) aims to imbue AI agents with the ability to learn a new task from a single demonstration. To supervise the learning, OSIL typically requires a prohibitively large number of paired expert demonstrations, i.e. trajectories corresponding to different variations of the same semantic task. To overcome this limitation, we introduce the semi-supervised OSIL problem setting, where the learning agent is presented with a large dataset of trajectories with no task labels (an unpaired dataset), along with a small dataset of multiple demonstrations per semantic task (a paired dataset). This presents a more realistic and practical embodiment of few-shot learning and requires the agent to effectively leverage weak supervision from the large dataset of trajectories. We then develop an algorithm tailored to this semi-supervised OSIL setting. Our approach first learns an embedding space in which different tasks cluster distinctly. We use this embedding space, and the clustering it supports, to self-generate pairings between trajectories in the large unpaired dataset. Through empirical results on simulated control tasks, we demonstrate that OSIL models trained on such self-generated pairings are competitive with OSIL models trained with ground-truth labels, representing a major advance in the label-efficiency of OSIL. |
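The pairing step summarized in the abstract (embed trajectories so that semantic tasks form distinct clusters, then self-generate pairings between trajectories in the unpaired dataset) can be illustrated with a minimal sketch. This is not the paper's implementation: the embeddings below are random stand-ins for the output of a learned task-embedding network, the cluster layout is invented for the demo, and nearest-neighbor matching is just one simple way to turn an embedding space into pseudo-pairings.

```python
import numpy as np

# Hypothetical trajectory embeddings. In the paper's setting these would come
# from a learned embedding network; here, three well-separated 2-D clusters
# stand in for three distinct semantic tasks (20 trajectories each).
rng = np.random.default_rng(0)
task_centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
unpaired = np.concatenate(
    [c + 0.1 * rng.standard_normal((20, 2)) for c in task_centers]
)

def self_generate_pairs(embeddings):
    """Pair each trajectory with its nearest neighbor in embedding space
    (excluding itself), yielding pseudo task-pairings for OSIL training."""
    # Pairwise squared Euclidean distances between all embeddings.
    d = ((embeddings[:, None, :] - embeddings[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)  # a trajectory cannot be paired with itself
    return d.argmin(axis=1)      # index of each trajectory's nearest neighbor

pairs = self_generate_pairs(unpaired)
```

Because the clusters are far apart relative to their spread, every trajectory's nearest neighbor lies in its own cluster, so the self-generated pairings agree with the (hidden) task labels; these pairs could then supervise a standard OSIL model in place of ground-truth pairings.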