AI Pulse
📄 Paper Explainer

Recommender systems stop fighting themselves: one Transformer handles every task

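Recommender systems usually have to predict several objectives at once (clicks, favorites, purchases), but those objectives interfere with one another and produce a seesaw effect: optimize click-through rate and the purchase rate drops. The conventional approach separates feature extraction from task prediction, which is like asking one interpreter to relay a message to two people who speak different languages, so information gets lost along the way. OneRank folds the whole pipeline into a single Transformer, gives each task its own dedicated channel, shares information only where necessary, and replaces fixed scoring formulas with dynamic matching-based scoring. In offline and online experiments on large-scale industrial data, it significantly outperforms state-of-the-art baselines while maintaining computational efficiency. This is not a tool you can use tomorrow, but it points to a new direction for recommender architecture: let the model learn for itself when to cooperate and when to work independently.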
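To make the seesaw effect concrete, here is a minimal toy sketch, not taken from the paper. It assumes two tasks share one weight vector and that each task has a simple quadratic loss around its own preferred weights; the names `CLICK_TARGET`, `PURCHASE_TARGET`, and `LEARNING_RATE` are hypothetical. When the two task gradients point in opposing directions, a step that helps one task hurts the other.

```python
import math


def loss(w, target):
    """Toy task loss: 0.5 * squared distance from the task's preferred weights."""
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def grad(w, target):
    """Gradient of the toy loss with respect to the shared weights w."""
    return [wi - ti for wi, ti in zip(w, target)]


def cosine(u, v):
    """Cosine similarity between two gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical per-task optima for the shared weights (assumed for illustration).
CLICK_TARGET = [1.0, 0.0]
PURCHASE_TARGET = [-1.0, 0.5]
LEARNING_RATE = 0.5

shared = [0.0, 0.0]
g_click = grad(shared, CLICK_TARGET)
g_purchase = grad(shared, PURCHASE_TARGET)

# A negative cosine means the two tasks pull the shared weights apart.
conflict = cosine(g_click, g_purchase)

# Take one gradient step that only serves the click task.
stepped = [w - LEARNING_RATE * g for w, g in zip(shared, g_click)]

click_before, click_after = loss(shared, CLICK_TARGET), loss(stepped, CLICK_TARGET)
purchase_before = loss(shared, PURCHASE_TARGET)
purchase_after = loss(stepped, PURCHASE_TARGET)

print(f"gradient cosine: {conflict:.3f}")
print(f"click loss: {click_before:.3f} -> {click_after:.3f}")
print(f"purchase loss: {purchase_before:.3f} -> {purchase_after:.3f}")
```

In this toy setup the click loss falls while the purchase loss rises, which is the kind of trade-off that task-private channels are meant to reduce.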

📄 Original Abstract

Multi-task learning (MTL) is essential in recommender systems to enable complementary learning among diverse user feedback. While modern industrial practices have shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, they still decouple feature encoding from multi-task prediction, treating the Transformer as a task-agnostic encoder. This design fundamentally limits the performance and scalability by (1) creating an information bottleneck under heterogeneous task objectives, (2) inducing gradient interference that leads to the seesaw phenomenon, and (3) forcing a dataflow transition in which attention-based, context-adaptive representation learning is converted to static feed-forward task prediction with incompatible information read-write dynamics. We propose OneRank, a Transformer-native multi-task ranking framework that eliminates encoder-predictor separation and introduces task-private channels for forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring for context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.
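The abstract contrasts static task-specific MLP scorers with dynamic matching-based scoring but does not give the scoring function. The sketch below is one possible reading, under stated assumptions: a static scorer applies fixed per-task weights to the candidate, while a matching scorer first derives a task query from the user context and then matches it against the candidate with a scaled dot product. The function names, the linear projection, and the scaling are all hypothetical and are not OneRank's actual design.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def static_score(task_weights, candidate_repr):
    """Static scorer (assumed stand-in for a per-task MLP head):
    the same fixed weights are applied regardless of user context."""
    return sigmoid(dot(task_weights, candidate_repr))


def dynamic_matching_score(task_proj, context_repr, candidate_repr):
    """Matching scorer (hypothetical): build a task query from the user
    context, then score the candidate by scaled dot-product matching."""
    query = matvec(task_proj, context_repr)
    return sigmoid(dot(query, candidate_repr) / math.sqrt(len(query)))


# Toy inputs, all assumed for illustration.
task_proj = [[1.0, 0.0], [0.0, 1.0]]   # per-task projection (identity here)
candidate = [2.0, -2.0]                 # one candidate item
context_a = [1.0, 0.0]                  # user context A
context_b = [0.0, 1.0]                  # user context B

score_a = dynamic_matching_score(task_proj, context_a, candidate)
score_b = dynamic_matching_score(task_proj, context_b, candidate)
fixed = static_score([0.5, 0.5], candidate)

print(f"dynamic score, context A: {score_a:.3f}")
print(f"dynamic score, context B: {score_b:.3f}")
print(f"static score (context-independent): {fixed:.3f}")
```

The same candidate receives different scores under the two contexts with the matching scorer, whereas the static scorer returns one value no matter who is asking.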

Original paper on arXiv
