Robots write their own code to learn skills, then reuse them across settings
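Programming a robot usually means a human expert hand-writing code for every awkward case. This paper has the robot write its own programs, learn by trial and error, and distill what it learns into skills. Think of it as a tireless apprentice: it attempts a task, and when it fails it diagnoses the cause and repairs the code on its own, turns the validated fix into a reusable skill module, and keeps exploring new task combinations to grow its skill library. The results are strong: it beats prior methods by up to 77% on LIBERO-Pro manipulation under perturbation, 72% on Robosuite bimanual handover, and 32% on BEHAVIOR-1K long-horizon household tasks. With its accumulated skills it also completes unseen long-horizon tasks zero-shot (31% success versus 4% for prior methods on LIBERO-Pro Long). More importantly, the authors report initial evidence that skills discovered in simulation transfer to real robots, which substantially reduces real-robot programming effort. This is not technology you can use tomorrow, but it points toward a future in which robots no longer need humans to write every instruction and instead learn from practice, much as people do.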
📄 Original abstract (English)
Traditional robot programming is challenging: it requires orchestrating multimodal perception, managing physical contact dynamics, and handling diverse configurations and execution failures. We introduce ASPIRE (Agentic Skill Programming through Iterative Robot Exploration), a continual learning system that autonomously writes and refines robot control programs in a code-as-policy paradigm while compounding experience into a reusable skill library. ASPIRE discovers skills that persist across tasks, simulation and real-world settings, and embodiments. It operates in an open-ended loop with three components: (1) a closed-loop robot execution engine that exposes fine-grained multimodal traces, enabling autonomous failure diagnosis, repair synthesis, and validation; (2) a continually expanding skill library that distills validated fixes into reusable, transferable knowledge; and (3) evolutionary search that generates diverse task sequences and control programs to explore beyond single-trajectory refinement. ASPIRE surpasses prior methods by up to 77% on LIBERO-Pro manipulation under perturbation, 72% on Robosuite bimanual handover, and 32% on BEHAVIOR-1K long-horizon household tasks. Its accumulated library also enables zero-shot generalization to unseen long-horizon tasks: on LIBERO-Pro Long, ASPIRE achieves 31% success versus 4% for prior methods despite their use of test-time reasoning and retries. Finally, simulation-discovered skills provide initial evidence of sim-to-real transfer, substantially reducing real-robot programming effort across different embodiments and robot APIs.
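To make the loop in the abstract concrete, here is a minimal toy sketch of the execute, diagnose, repair, validate, distill cycle. It is not ASPIRE's implementation: every name in it (`Trace`, `SkillLibrary`, `execute`, `solve`, the `handle:` strings) is hypothetical, the "robot" is a stand-in that fails on any hazard the program does not handle, and the evolutionary search component is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Trace:
    """Hypothetical execution trace: did the run succeed, and if not, why."""
    success: bool
    failure: Optional[str] = None


@dataclass
class SkillLibrary:
    """Hypothetical library mapping a failure label to a validated fix."""
    fixes: Dict[str, str] = field(default_factory=dict)


def execute(program: List[str], hazards: List[str]) -> Trace:
    """Toy execution engine: fail on the first hazard the program does not handle."""
    for hazard in hazards:
        if f"handle:{hazard}" not in program:
            return Trace(success=False, failure=hazard)
    return Trace(success=True)


def solve(hazards: List[str], library: SkillLibrary,
          max_repairs: int = 5) -> Tuple[bool, int]:
    """Run a task, repairing the program after each failure.

    Returns (succeeded, number_of_repairs). A fix is added to the library
    only after a re-run shows that the failure it targets is gone.
    """
    # Start from every fix already in the library (skill reuse).
    program = ["base_plan"] + sorted(library.fixes.values())
    repairs = 0
    pending: Optional[Tuple[str, str]] = None  # (failure, fix) awaiting validation
    while True:
        trace = execute(program, hazards)
        if pending is not None and trace.failure != pending[0]:
            # The targeted failure no longer occurs: distill the fix.
            library.fixes[pending[0]] = pending[1]
            pending = None
        if trace.success:
            return True, repairs
        if repairs == max_repairs:
            return False, repairs
        # "Diagnosis" here is just reading the failure label off the trace.
        fix = f"handle:{trace.failure}"
        program.append(fix)
        pending = (trace.failure, fix)
        repairs += 1


library = SkillLibrary()
first = solve(["slip", "occlusion"], library)   # needs two repairs
second = solve(["occlusion", "slip"], library)  # reuses the library, no repairs
print(first, second)
```

In this toy setting the second task succeeds without any repair because the fixes validated on the first task are already in the library, which loosely mirrors the zero-shot reuse the abstract describes. A real system would replace the failure labels with multimodal traces and the string fixes with generated control code.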