SkillWrapper: Generative Predicate Invention for Task-level Planning

1Department of Computer Science, Brown University, 2Allen Institute for AI

*Indicates Equal Contribution

Robot manipulation experiments using Franka and bi-manual Kuka arms.

Abstract

Generalizing from individual skill executions to solving long-horizon tasks remains a core challenge in building autonomous robots. A promising direction is learning high-level, symbolic representations of a robot's low-level skills, enabling reasoning and planning independent of the low-level state space. Recent advances in foundation models have made it possible to generate symbolic predicates that operate on raw sensory inputs—a process we call generative predicate invention—to facilitate downstream representation learning. However, it remains unclear which formal properties the learned representations must satisfy, and how they can be learned so as to guarantee these properties. In this paper, we address both questions by presenting a formal theory of generative predicate invention for task-level planning, resulting in symbolic operators that support provably sound and complete planning. We propose SkillWrapper, a method that leverages foundation models to actively collect robot data and learn human-interpretable, plannable representations of black-box skills, using only RGB image observations. Our extensive empirical evaluation in simulation and on real robots shows that SkillWrapper learns abstract representations that enable solving unseen, long-horizon tasks in the real world with black-box skills.
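To make the idea of predicates grounded in raw sensory inputs concrete, here is a minimal sketch of one way such a predicate could be represented. The `Predicate` class and the `classifier` interface are illustrative assumptions, not the paper's implementation; in practice the classifier could be a foundation model prompted with the predicate's natural-language description.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Predicate:
    """A symbolic predicate grounded in raw RGB observations (illustrative sketch)."""
    name: str
    classifier: Callable  # e.g. a vision-language model queried with this predicate's description

    def holds(self, image) -> bool:
        # Evaluate the predicate directly on a raw image observation
        return self.classifier(image)

# Hypothetical usage: any boolean image classifier can back a predicate.
# Here a dict stands in for an RGB observation for demonstration only.
on_table = Predicate("OnTable(block)", classifier=lambda img: img.get("block_on_table", False))
```

The key design point is that the symbol's meaning is carried entirely by its grounded classifier, so planning can proceed over predicate names while evaluation stays tied to raw observations.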

Overview

We introduce SkillWrapper, a novel approach that autonomously learns symbolic representations for black-box skills while providing guarantees such as soundness and completeness (see Appendix B for further theoretical details). To produce a valid abstract model that enables planning, SkillWrapper iterates through a three-step process: (1) actively proposing and executing exploratory skill sequences to collect data on the initiation and termination sets of each skill, (2) incrementally building a set of predicates from scratch by contrasting positive and negative examples, and (3) constructing valid operators from these invented predicates, which in turn inform the next round of exploratory skill sequences.

Experiments

To demonstrate the applicability of SkillWrapper to real-world robotic settings, we designed two sets of experiments on two robotic platforms: a Franka Emika Panda robot and a bimanual platform with two Kuka iiwa robots. Our results demonstrate that SkillWrapper is effective in real robot settings: our method generalizes skill representations learned in restricted domains to richer environments and progressively improves in more challenging scenarios with irreversible actions and interdependent skills. SkillWrapper outperforms all baseline methods, highlighting the importance of predicate invention and iterative learning for scaling symbolic representations to embodied tasks.

Results of Generalization Experiment in Franka Domain


Contributions

We highlight the following contributions of SkillWrapper: (1) a formal theory of generative predicate invention for provably sound and complete representations; (2) SkillWrapper, a principled system built on this framework that leverages foundation models to learn interpretable symbolic representations of black-box skills; and (3) an extensive empirical evaluation of the system on simulated and real robotic platforms.