📄 Paper Digest

AI Models Can Have Fast and Slow Brains Too: One Answers Instantly, One Thinks Deeply


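Large models are usually either fast but shallow or deep but slow. This report splits the AI into two roles: Ling-2.6 handles instant replies and suits scenarios that need an immediate response, such as chat and customer service, while Ring-2.6 specializes in complex reasoning and agentic work such as writing code, searching, and calling tools. Both are upgraded from the same base model (Ling-2.0) but are trained differently: Ling is tuned to give short, correct answers, using methods such as shortest-correct-response distillation, and Ring is trained with reinforcement learning on environment-grounded tasks. This is not a technology you can put to use tomorrow, but it points to a trend: future AI systems may no longer rely on a single model to do everything, and may instead work the way people do, with a fast-thinking system and a slow-thinking system cooperating.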
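To make the idea of shortest-correct-response distillation more concrete, here is a minimal sketch of one plausible reading of the name: for each prompt, sample several responses, discard the incorrect ones, and keep the shortest remaining response as the training target. The report summary does not spell out the actual procedure, so everything here is an assumption for illustration: the `Sample` type, the `shortest_correct` helper, the externally supplied `correct` flag, and the use of whitespace word count as a stand-in for token length are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One sampled response to a prompt (hypothetical structure)."""
    prompt: str
    response: str
    correct: bool  # assumed to come from an external verifier or grader


def length(text: str) -> int:
    # Assumption: whitespace word count stands in for a real tokenizer.
    return len(text.split())


def shortest_correct(samples: list[Sample]) -> dict[str, str]:
    """For each prompt, keep the shortest response that is marked correct."""
    best: dict[str, str] = {}
    for s in samples:
        if not s.correct:
            continue  # incorrect responses are never used as targets
        current = best.get(s.prompt)
        if current is None or length(s.response) < length(current):
            best[s.prompt] = s.response
    return best


samples = [
    Sample("2+2?", "Let me think step by step. 2 plus 2 equals 4.", True),
    Sample("2+2?", "4", True),
    Sample("2+2?", "5", False),
]
print(shortest_correct(samples))  # prints {'2+2?': '4'}
```

The resulting prompt-to-response pairs could then serve as supervised fine-tuning data, which would push a model toward answers that are both correct and concise. Prompts with no correct sample simply drop out of the set.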

📄 Original Abstract (English)

Efficient and scalable agentic intelligence requires models that can deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with MLA, improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.

Original paper on arXiv
