AI Pulse
📄 Paper Digest

The Shortcut Trap in AI Search: How to Make Models Truly Learn to Search

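Ask an AI to search the web for the answer to a complex question, and it may not search carefully at all. Instead, it may guess the answer from a single keyword or from common knowledge. That is a "shortcut." The researchers find that existing training data looks complex, yet models can often find a lazy path through it. They identify four typical shortcuts: several pieces of evidence point to the same answer, a single clue is enough to pin down the answer, the answer is exposed in the question itself, or common knowledge alone is enough to guess correctly. They then design a method, FORT, that deliberately blocks these shortcuts while the training data is being generated, so the model has to complete the full search before it can find the answer. The search model trained this way, FORT-Searcher, achieves the best overall performance among comparable-size open-source search agents on challenging deep search benchmarks, using supervised fine-tuning only and no reinforcement learning. It is not a tool you can use tomorrow, but it points to what drives better AI search: not making the data more complex, but making it more "honest."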

📄 Original Abstract (English)

Training deep search agents requires verifiable questions whose answers remain unavailable until sufficient evidence has been acquired through search. Existing synthesis methods often increase apparent difficulty by enriching graph structures, but structural complexity alone does not guarantee realized search difficulty: the intended search process can collapse through a cheaper identifying route. We formalize this gap with a shortcut-aware difficulty framework and identify four actionable shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding. To diagnose their realized effects, we use trajectory signatures including solving cost, answer hit time, and prior-shortcut rate. Guided by this framework, we introduce FORT, a Framework of Shortcut-Resistant Training-Data Synthesis. FORT constructs shortcut-resistant training data by controlling shortcut risks across entity selection, evidence graph construction, question formulation, and adversarial refinement. Experiments show that FORT induces longer pre-answer search and fewer shortcut patterns than existing open-source deep search datasets. Using the resulting trajectories, we train FORT-Searcher with supervised fine-tuning (SFT) only, and it achieves the best overall performance among comparable-size open-source search agents on challenging deep search benchmarks. Relevant resources will be made available at https://github.com/RUCAIBox/FORT-Searcher.
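The abstract names three trajectory signatures (solving cost, answer hit time, prior-shortcut rate) without defining them here. The Python sketch below shows one plausible way such signatures could be computed from logged search trajectories. Everything in it is an assumption made for illustration, not the paper's definition: the `Step` and `Trajectory` types are hypothetical, solving cost is taken to be the number of search steps, answer hit time is the first step whose observation contains the answer string, and a "prior shortcut" is counted when the agent types the answer into a query before any search result has shown it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    query: str        # search query issued by the agent (hypothetical log field)
    observation: str  # text returned by the search tool (hypothetical log field)


@dataclass
class Trajectory:
    steps: List[Step]
    answer: str       # gold answer string


def solving_cost(traj: Trajectory) -> int:
    """Assumed definition: number of search steps in the trajectory."""
    return len(traj.steps)


def answer_hit_time(traj: Trajectory) -> Optional[int]:
    """Assumed definition: 1-based index of the first step whose
    observation contains the answer, or None if it never appears."""
    needle = traj.answer.lower()
    for i, step in enumerate(traj.steps, start=1):
        if needle in step.observation.lower():
            return i
    return None


def uses_prior_shortcut(traj: Trajectory) -> bool:
    """Assumed definition: the answer shows up in a query before any
    observation has surfaced it, suggesting it came from prior knowledge."""
    needle = traj.answer.lower()
    seen_in_results = False
    for step in traj.steps:
        if needle in step.query.lower() and not seen_in_results:
            return True
        if needle in step.observation.lower():
            seen_in_results = True
    return False


def prior_shortcut_rate(trajs: List[Trajectory]) -> float:
    """Fraction of trajectories flagged by uses_prior_shortcut."""
    if not trajs:
        return 0.0
    return sum(uses_prior_shortcut(t) for t in trajs) / len(trajs)
```

Under these assumed definitions, a dataset that resists shortcuts would tend to show higher solving cost, later answer hit times, and a lower prior-shortcut rate. Substring matching is a deliberate simplification; a real analysis would need answer normalization and alias handling.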

Original paper on arXiv
