AI agents can get better by reviewing their own past runs, with no human labeling required
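AI agents (for example, tools that help you write code or do research) usually rely on a "harness" of skills, tools, and workflows to solve problems. Until now, optimizing that harness has typically required validation sets with ground-truth answers, which are hard to obtain in real deployments. This paper proposes Retrospective Harness Optimization (RHO): the agent looks back over its past trajectories, picks a diverse set of the most challenging tasks, and re-solves them in parallel. It then analyzes those attempts using self-validation and self-consistency, generates candidate harness updates, and selects the best one through its own pairwise comparison. On SWE-Bench Pro, a software engineering benchmark, a single optimization round raised the pass rate from 59% to 78% without any external grading. It is not a tool you can start using tomorrow, but it points to a direction: AI agents that, like people, learn from their mistakes and iterate on their own.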
📄 Original abstract
AI agents rely on a harness of skills, tools, and workflows to solve complex problems. Continually improving this harness is essential for adapting to new tasks. However, existing optimization methods typically require ground-truth validation sets, yet such labeled data is difficult to acquire in practical deployment settings. To address this problem, we introduce Retrospective Harness Optimization (RHO), a self-supervised method that optimizes the agent harness using only past trajectories. Specifically, RHO selects a diverse coreset of challenging tasks from past trajectories and re-solves them in parallel. The agent analyzes these rollouts using self-validation and self-consistency, then generates candidate harness updates and selects the most effective one by its own pairwise self-preference. We evaluate RHO across three diverse domains, spanning software engineering, technical work, and knowledge work. Notably, a single optimization round improves the pass rate on SWE-Bench Pro from 59% to 78% without any external grading. Furthermore, our analysis demonstrates that RHO effectively targets prior failure modes. As a result, the optimized harness alters the agent's behavior patterns and sustains higher accuracy during long-horizon sessions.
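The two selection steps in the abstract (choosing a diverse coreset of challenging tasks, then picking a harness update by pairwise self-preference) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the coreset is built or how pairwise preferences are aggregated, so the greedy farthest-point heuristic and the win-count tally here are stand-ins. The `difficulty` field, the `distance` function, and the `prefer` callable are all hypothetical; in a real system `prefer` would presumably be a call to the agent itself.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_coreset(
    tasks: Sequence[dict],
    k: int,
    distance: Callable[[dict, dict], float],
) -> list[dict]:
    """Pick up to k hard, mutually dissimilar tasks (greedy farthest-point)."""
    # Assumption: each task dict carries a "difficulty" score derived from
    # past trajectories (for example, 1.0 for a run that clearly failed).
    pool = sorted(tasks, key=lambda t: t["difficulty"], reverse=True)
    if not pool or k <= 0:
        return []
    chosen = [pool[0]]  # seed with the hardest task
    remaining = pool[1:]
    while remaining and len(chosen) < k:
        # Favor tasks that are both hard and far from everything chosen so far.
        best = max(
            remaining,
            key=lambda t: t["difficulty"] * min(distance(t, c) for c in chosen),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


def select_by_pairwise_preference(
    candidates: Sequence[str],
    prefer: Callable[[str, str], str],
) -> str:
    """Return the candidate that wins the most head-to-head comparisons."""
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        # Assumption: prefer(a, b) returns whichever of the two it judges better.
        wins[prefer(a, b)] += 1
    # On a tie, max() keeps the candidate that appears first in the list.
    return max(candidates, key=lambda c: wins[c])
```

A round-robin tally costs one comparison per pair of candidates, which is fine for a handful of candidate updates; with many candidates a knockout bracket would need fewer comparisons at the cost of being more sensitive to a single noisy judgment.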