SkillWrapper: Generative Predicate Invention for Task-level Planning

1Department of Computer Science, Brown University, 2Allen Institute for AI

*Indicates Equal Contribution

Robot manipulation experiments using Franka and bi-manual Kuka arms.

Abstract

Generalizing from individual skill executions to solving long-horizon tasks remains a core challenge in building autonomous robots. A promising direction is learning high-level, symbolic representations of a robot's low-level skills, enabling reasoning and planning independent of the low-level state space. Recent advances in foundation models have made it possible to generate symbolic predicates that operate on raw sensory inputs—a process we call generative predicate invention—to facilitate downstream representation learning. However, it remains unclear which formal properties the learned representations must satisfy, and how they can be learned so as to guarantee these properties. In this paper, we address both questions by presenting a formal theory of generative predicate invention for task-level planning, resulting in symbolic operators that support provably sound and complete planning. We propose SkillWrapper, a method that leverages foundation models to actively collect robot data and learn human-interpretable, plannable representations of black-box skills, using only RGB image observations. Our extensive empirical evaluation in simulation and on real robots shows that SkillWrapper learns abstract representations that enable solving unseen, long-horizon tasks in the real world with black-box skills.
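To make the idea of predicates grounded in raw sensory inputs concrete, here is a minimal sketch of one way such a predicate could be represented. The `Predicate` class and the `classifier` interface are illustrative assumptions, not the paper's implementation; in practice the classifier could be a foundation model prompted with the predicate's natural-language description.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Predicate:
    """A symbolic predicate grounded in raw RGB observations (illustrative sketch)."""
    name: str
    classifier: Callable  # e.g. a vision-language model queried with this predicate's description

    def holds(self, image) -> bool:
        # Evaluate the predicate directly on a raw image observation
        return self.classifier(image)

# Hypothetical usage: any boolean image classifier can back a predicate.
# Here a dict stands in for an RGB observation for demonstration only.
on_table = Predicate("OnTable(block)", classifier=lambda img: img.get("block_on_table", False))
```

The key design point is that the symbol's meaning is carried entirely by its grounded classifier, so planning can proceed over predicate names while evaluation stays tied to raw observations.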

Overview

We introduce SkillWrapper, a novel approach that autonomously learns symbolic representations for black-box skills while providing guarantees such as soundness and completeness (see Appendix B for further theoretical details). To produce a valid abstract model that enables planning, SkillWrapper iterates through a three-step process: (1) actively proposing and executing exploratory skill sequences to collect data on the initiation and termination sets of each skill, (2) incrementally building a set of predicates from scratch by contrasting positive and negative examples, and (3) constructing valid operators from these invented predicates, which in turn inform the next round of exploratory skill sequences.

Experiments

To demonstrate the applicability of SkillWrapper to real-world robotic settings, we designed two sets of experiments on two robotic platforms: a Franka Emika Panda robot and a bimanual platform with two Kuka iiwa robots. Our results demonstrate that SkillWrapper is effective in real robot settings: our method generalizes skill representations learned in restricted domains to richer environments and progressively improves in more challenging scenarios with irreversible actions and interdependent skills. SkillWrapper outperforms all baseline methods, highlighting the importance of predicate invention and iterative learning for scaling symbolic representations to embodied tasks.

Results of Generalization Experiment in Franka Domain


Contributions

We highlight the following contributions of SkillWrapper: (1) a formal theory of generative predicate invention for provably sound and complete representations; (2) SkillWrapper, a principled system built on this framework that leverages foundation models to learn interpretable symbolic representations of black-box skills; and (3) an extensive empirical evaluation of the system on simulated and real robotic platforms.