📄 Paper digest

Robots that can tell, mid-task, that they are about to fail

Trusted channel · robotics · failure detection · world models · long-horizon tasks · self-monitoring

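When robots run long-horizon tasks such as assembly or tidying up, they often fail partway through, and a human can rarely watch them in real time. This paper teaches the robot to anticipate failure on its own: it uses a world model to predict the state after each step, and a large gap between the prediction and what actually happens signals that something is about to go wrong. Training needs only a final success-or-failure label per task, with no manual annotation of intermediate steps. The method is validated in simulation and on real robots (ReactorX-200 and Franka). You will not be deploying it tomorrow, but the direction is clear: robots that monitor themselves instead of relying entirely on human supervision.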
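The "prediction versus reality" intuition above can be illustrated with a minimal sketch. It is not Foresight's actual scoring rule, which the abstract describes only as being built on latent world-model embeddings; the function names `discrepancy` and `first_alarm_step`, the Euclidean distance, the fixed threshold, and the toy 2-D "latent states" are all assumptions made for illustration.

```python
import math

def discrepancy(predicted, observed):
    """Euclidean distance between a predicted and an observed state vector."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)))

def first_alarm_step(predicted_states, observed_states, threshold):
    """Return the index of the first step whose discrepancy exceeds
    the threshold, or None if no step does."""
    for step, (pred, obs) in enumerate(zip(predicted_states, observed_states)):
        if discrepancy(pred, obs) > threshold:
            return step
    return None

# Toy rollout (hypothetical numbers): the observed states drift away
# from what the world model predicted from step 2 onward.
predicted = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
observed = [[0.0, 0.1], [1.0, 0.1], [2.0, 1.5], [2.0, 3.0]]

alarm = first_alarm_step(predicted, observed, threshold=1.0)
print(alarm)  # step index at which the monitor would raise an alarm
```

In this toy rollout the alarm fires at the first step where the drift exceeds the threshold, before the trajectory ends, which is the point of monitoring during execution instead of checking only the final outcome.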

📄 Original abstract (in English)

Long-horizon tasks are common in real-world robotic deployments, yet failure detection for such tasks remains underexplored. Detecting failures in long-horizon robotic tasks is particularly challenging because failure onset is often ambiguous and dense temporal annotations are typically unavailable. We present Foresight, a failure detection framework that monitors manipulation trajectories using latent representations from an action-conditioned world model. Foresight is trained using only final task-level success or failure labels. By leveraging predictive world-model embeddings, our method provides a unified framework for failure detection across different policies. We further use functional conformal prediction (FCP) to calibrate detection thresholds adaptively. We evaluate Foresight with state-of-the-art vision-language-action policies in simulation on LIBERO-Long, ManiSkill-Long, and BEHAVIOR-1K, compare it against state-of-the-art failure detection methods, and validate it on real robots with three long-horizon tasks on a ReactorX-200 arm and one task on a Franka arm. Our results suggest that action-conditioned world-model embeddings provide a scalable representation for reliable failure monitoring in long-horizon manipulation.
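To give a feel for what calibrating a detection threshold with conformal prediction means, here is a simplified sketch. It uses plain split conformal prediction with a single scalar threshold, not the functional (time-varying) variant the abstract names, and it is not the paper's implementation. The helpers `trajectory_score` and `conformal_threshold`, the choice of the per-rollout maximum as the score, and all numbers are assumptions.

```python
import math

def trajectory_score(step_scores):
    """Summarize one rollout by its worst (largest) per-step failure score."""
    return max(step_scores)

def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest
    calibration score. If calibration rollouts and a new successful rollout
    are exchangeable, the new rollout exceeds it with probability <= alpha."""
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration rollouts for this alpha: never raise an alarm.
        return math.inf
    return sorted(calibration_scores)[rank - 1]

# Hypothetical per-step scores from nine successful calibration rollouts.
calibration_rollouts = [
    [0.1, 0.2, 0.1], [0.3, 0.2, 0.4], [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.2], [0.6, 0.2, 0.3], [0.2, 0.3, 0.2],
    [0.4, 0.7, 0.1], [0.3, 0.3, 0.3], [0.2, 0.8, 0.2],
]
scores = [trajectory_score(r) for r in calibration_rollouts]
threshold = conformal_threshold(scores, alpha=0.25)

# A new rollout is flagged when its score exceeds the calibrated threshold.
new_rollout = [0.2, 0.9, 0.3]
flagged = trajectory_score(new_rollout) > threshold
print(threshold, flagged)
```

The appeal of this kind of calibration is that the threshold comes from held-out data and a target false-alarm rate `alpha` instead of hand tuning; a functional variant would replace the single number with a band that changes over the course of the trajectory.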

Original paper on arXiv
