AI Pulse
📄 Paper Digest

Memory Systems for AI Agents: No One-Size-Fits-All Solution

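Memory for AI agents (AI systems that carry out tasks autonomously) is no longer a simple lookup step. It has evolved into something closer to a database: a system that has to manage how information is stored, updated, consolidated, and governed over its lifecycle. Existing evaluations, however, mostly measure end-to-end task success and treat the memory system as a black box. This paper breaks agent memory into four core modules (representation and storage, extraction, retrieval and routing, and maintenance) and evaluates 12 memory systems plus two reference baselines. It finds that no single architecture is best in every scenario: effectiveness depends on how well the memory structure matches the bottleneck of the workload. For example, one system may handle long-term memory well but be expensive to update, while another retrieves quickly but forgets easily. The authors also find that localized maintenance is more cost-efficient than global reorganization. The takeaway: do not expect one universal memory system; choose one to fit the task.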
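To make the four-module split and the maintenance cost gap concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not any system or method from the paper: the names `Record`, `ToyMemory`, `run`, and the `touched` counter are hypothetical, observations are assumed to be simple `key=value` strings, and "cost" is approximated by the number of records a maintenance step touches.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    key: str    # hypothetical identifier for a remembered fact
    value: str  # the remembered content
    step: int   # agent step at which the record was written


@dataclass
class ToyMemory:
    """Toy model of the four modules; illustrative only."""

    store: dict[str, Record] = field(default_factory=dict)  # storage
    touched: int = 0  # cost proxy: records touched by maintenance

    def extract(self, observation: str, step: int) -> Record | None:
        """Extraction: parse a 'key=value' observation into a record."""
        key, sep, value = observation.partition("=")
        if not sep:
            return None  # nothing worth remembering in this observation
        return Record(key.strip(), value.strip(), step)

    def retrieve(self, query: str) -> list[Record]:
        """Retrieval: records whose key contains the query, newest first."""
        hits = [r for r in self.store.values() if query in r.key]
        return sorted(hits, key=lambda r: r.step, reverse=True)

    def maintain_local(self, record: Record) -> None:
        """Localized maintenance: overwrite only the entry with the same key."""
        self.store[record.key] = record
        self.touched += 1

    def maintain_global(self, record: Record) -> None:
        """Global reorganization: insert, then rebuild the whole store."""
        self.store[record.key] = record
        self.store = {k: self.store[k] for k in sorted(self.store)}
        self.touched += len(self.store)


def run(observations: list[str], use_global: bool) -> ToyMemory:
    """Feed observations through extraction and one maintenance policy."""
    memory = ToyMemory()
    for step, obs in enumerate(observations):
        record = memory.extract(obs, step)
        if record is None:
            continue
        if use_global:
            memory.maintain_global(record)
        else:
            memory.maintain_local(record)
    return memory
```

With the same observations, both policies end up with identical contents, but the global variant touches every stored record on each write, so its cost proxy grows with the size of the store while the localized variant stays at one touch per write. Real systems measure cost in tokens, latency, or LLM calls rather than a counter like this.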

📄 Original Abstract

Memory for large language model (LLM) agents has rapidly evolved from simple retrieval-augmented mechanisms into a data management system that supports persistent information storage, retrieval, update, consolidation, and dynamic lifecycle governance throughout agent execution. Despite this evolution, existing evaluations still benchmark agent memory mainly through end-to-end task success metrics (e.g., F1, BLEU), while treating the underlying system as a monolithic black box. As a result, critical system-level concerns, including operational costs, architectural trade-offs across memory modules, and robustness under dynamic knowledge updates, remain insufficiently explored. In this paper, we present a systematic experimental study of agent memory from a data management perspective. We propose an analytical framework that decomposes agent memory into four core modules: memory representation and storage, extraction, retrieval and routing, and maintenance. Under this framework, we evaluate 12 representative memory systems and two reference baselines across five benchmark workloads spanning 11 datasets. Our extensive end-to-end evaluation shows that no single architecture dominates across all scenarios; instead, effectiveness depends heavily on how well the memory structure aligns with the workload bottleneck. Furthermore, through fine-grained ablation studies, we quantify their individual effects on representation fidelity, retrieval precision, update correctness, and long-horizon stability. Finally, we reveal cost-performance trade-offs under realistic workloads, showing localized maintenance is more cost-efficient than global reorganization. Based on these findings, we identify promising directions towards building truly agent-native memory systems. The code is publicly available at https://github.com/OpenDataBox/MemoryData.

Original paper on arXiv
