📄 Paper Digest

Trillion-Parameter Models Can Reply Instantly Too: The Balancing Act of Ling/Ring 2.6

Tags: trillion parameters · hybrid attention · Evolutionary Chain-of-Thought · reinforcement learning · open source

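Large models tend to be either fast but shallow or smart but slow. Ling-2.6 and Ring-2.6 are two siblings from the same family: Ling focuses on instant responses, while Ring focuses on deep reasoning. Neither is trained from scratch; both are upgraded from the earlier Ling-2.0 base model, and a hybrid attention design (Lightning Attention plus MLA) speeds up long-context training and decoding. More importantly, techniques such as Evolutionary Chain-of-Thought and shortest-correct-response distillation let the model deliver equally good answers with fewer output tokens, which means lower cost and lower latency. Ring, for its part, relies on a reinforcement learning framework called KPop to stably train the trillion-parameter Ring-2.6-1T in real environments such as coding, search, and tool use. All checkpoints in the family are open-sourced. This is not something most teams can put to use tomorrow, but it shows that "fast and smart" is not a pipe dream; it is the art of engineering trade-offs.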
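To make the token-efficiency idea concrete, here is a minimal sketch of what "shortest-correct-response" selection could look like, judging only by the name. The abstract does not describe the actual procedure, so everything here is an assumption for illustration: the `Sample` type, the `correct` flag (standing in for some external verifier), and both function names are hypothetical and not taken from the Ling/Ring code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    text: str        # the model's full response
    num_tokens: int  # output length in tokens
    correct: bool    # verdict from an external checker (assumed to exist)


def shortest_correct(samples: list[Sample]) -> Optional[Sample]:
    """Return the correct response with the fewest tokens, or None if none is correct."""
    correct = [s for s in samples if s.correct]
    return min(correct, key=lambda s: s.num_tokens) if correct else None


def build_distillation_set(prompt_to_samples: dict[str, list[Sample]]) -> list[tuple[str, str]]:
    """Keep one (prompt, shortest correct response) pair per prompt.

    Prompts with no correct sample are skipped rather than filled with a wrong answer.
    """
    pairs = []
    for prompt, samples in prompt_to_samples.items():
        best = shortest_correct(samples)
        if best is not None:
            pairs.append((prompt, best.text))
    return pairs
```

Pairs selected this way could then serve as fine-tuning targets, nudging a model toward answers that are both correct and brief. Whether the authors filter, weight, or rank samples differently is not stated in the abstract.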

📄 Original Abstract (English)

Efficient and scalable agentic intelligence requires models that can deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with MLA, improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.

Original paper on arXiv
