📄 Paper Digest

When Should an AI Agent Stop?


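Ask an AI agent to shop online, look something up, or work in a terminal, and it may keep flailing until patience runs out or something goes wrong. This paper defines the problem of "agentic abstention": an agent needs to know not only the answer, but also when to admit "this can't be done" and stop. The authors evaluate 13 LLM-as-agent systems and 2 agent scaffolds on more than 28,000 tasks across web shopping, terminal environments, and question answering, and find that larger or more capable models are sometimes worse at abstaining in time. The gap is largest when an instruction looks feasible until the environment shows otherwise, for example when no valid result matches the request: some agents never abstain, and others do so only after many unnecessary interactions. The authors also propose CONVOLVE, a context engineering method that distills full interaction trajectories into reusable stopping rules. Without updating model parameters, it raises Llama-3.3-70B's timely recall rate on WebShop from 26.7 to 57.4. This isn't a feature you can use tomorrow, but it points to a key gap in AI reliability: being able to do the work is not the same as judging whether the work should be done.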

📄 Original Abstract (English)

LLM agents are expected to act over multiple turns, using search, browsing interfaces, and terminal tools to complete user goals. Yet not every goal is well specified or achievable in the available environment. In such cases, a reliable agent should recognize that further interaction is unlikely to help and abstain from additional tool calls. We define Agentic Abstention, the problem of deciding when an agent should stop acting under uncertainty. Unlike standard LLM abstention, which is usually evaluated as a single-turn answer-or-abstain decision, agentic abstention is a sequential decision problem: an agent can answer, abstain, or gather more information at each turn, and the need to abstain may only become clear after interacting with the environment. We study this problem across web shopping, terminal environments, and question answering, evaluating 13 LLM-as-agent systems and 2 agent scaffolds on more than 28,000 tasks. Our results show that the main challenge is not only whether agents can abstain, but also when they abstain. Some agents never abstain when they should, while others do so only after many unnecessary interactions. This gap is especially large on tasks where the instruction appears feasible until the environment reveals otherwise (e.g., no valid result matches the instruction). We further find that model scale, reasoning, and agent scaffolding affect abstention in different ways, where larger or more capable models sometimes perform worse at timely abstention. Finally, we introduce CONVOLVE, a context engineering method for improving agentic abstention that distills full interaction trajectories into reusable stopping rules. On WebShop, CONVOLVE substantially improves timely abstention without updating model parameters, raising Llama-3.3-70B's timely recall rate from 26.7 to 57.4. Our dataset and code are available at https://lhannnn.github.io/agentic-abstention
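To make the "answer, abstain, or gather more information at each turn" framing concrete, here is a minimal Python sketch of that decision loop. It is not the paper's code: `Action`, `run_episode`, `timely_abstention_rate`, the toy environment, and the `turn_limit` threshold are all hypothetical names and assumptions chosen for illustration, and the metric below is a simplified stand-in, not the paper's definition of timely recall.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Action(Enum):
    ANSWER = "answer"    # commit to a final result
    ABSTAIN = "abstain"  # declare the goal infeasible and stop
    GATHER = "gather"    # make another tool call


@dataclass
class Episode:
    final_action: Action
    turns_used: int


def run_episode(policy: Callable[[List[str]], Action],
                env_step: Callable[[int], str],
                max_turns: int = 10) -> Episode:
    """Roll out one task: at every turn the policy picks one of three actions."""
    observations: List[str] = []
    for turn in range(1, max_turns + 1):
        action = policy(observations)
        if action is not Action.GATHER:
            return Episode(action, turn)
        observations.append(env_step(turn))
    # Turn budget exhausted: the agent never stopped on its own.
    return Episode(Action.GATHER, max_turns)


def timely_abstention_rate(episodes: List[Episode], turn_limit: int) -> float:
    """Hypothetical metric: share of infeasible tasks abstained within turn_limit."""
    if not episodes:
        return 0.0
    timely = sum(1 for e in episodes
                 if e.final_action is Action.ABSTAIN and e.turns_used <= turn_limit)
    return timely / len(episodes)


# Toy infeasible task (assumption): every search comes back empty.
def empty_search(turn: int) -> str:
    return "no results"


def make_policy(patience: int) -> Callable[[List[str]], Action]:
    """Abstain once `patience` consecutive empty results have been seen."""
    def policy(observations: List[str]) -> Action:
        if len(observations) >= patience and all(o == "no results" for o in observations):
            return Action.ABSTAIN
        return Action.GATHER
    return policy


quick = run_episode(make_policy(2), empty_search)  # gives up early
slow = run_episode(make_policy(8), empty_search)   # keeps trying far longer
rate = timely_abstention_rate([quick, slow], turn_limit=4)
```

Both toy agents eventually abstain, but only the first does so within the turn limit, which is the distinction the abstract draws between whether an agent abstains and when it does.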

