📄 Paper Digest

AI Memory: It Can Remember, but It Can't Keep Secrets


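You ask an AI assistant to remember your medical records, a colleague's schedule, and your child's homework. It does remember them, but it may mix up whose information is whose. Existing AI memory benchmarks mostly test single-user settings. Once several people share one memory pool in a hospital, a company, or a household, new questions arise: who is allowed to see what, and when someone asks for a deletion, is the information really forgotten? This paper introduces a new benchmark, GateMem, that covers four domains (medical, office, education, and household) and makes the agent handle three things at once: long-term memory, access control, and active forgetting. The result: no method does all three well. Long-context prompting often achieves the best governance score, but at a high token cost; retrieval-based and external-memory methods are cheaper, yet they still leak unauthorized or deleted information. The conclusion is candid: today's AI memory agents are still far from reliable shared deployment.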
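To make the three requirements concrete, here is a minimal Python sketch of a shared memory pool with per-item read permissions, owner-initiated deletion, and a simple leak check. It is an illustration only, not GateMem's implementation: the names `MemoryItem`, `SharedMemory`, and `find_leaks`, the reader-set permission model, and the substring-based leak check are all assumptions made for this example. The paper describes structured judging with leak-target annotations, which this toy check only loosely imitates.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    """One remembered fact (hypothetical structure, not from the paper)."""
    owner: str
    text: str
    readers: frozenset  # principals allowed to read this item


class SharedMemory:
    """Toy multi-principal memory pool with access control and forgetting."""

    def __init__(self):
        self._items = []

    def write(self, owner, text, readers=()):
        # The owner can always read their own item.
        item = MemoryItem(owner, text, frozenset(readers) | {owner})
        self._items.append(item)
        return item

    def read(self, principal):
        # Access control: return only items this principal may see.
        return [m.text for m in self._items if principal in m.readers]

    def forget(self, requester, text):
        # Active forgetting: only the owner can delete; returns the count removed.
        before = len(self._items)
        self._items = [
            m for m in self._items
            if not (m.owner == requester and m.text == text)
        ]
        return before - len(self._items)


def find_leaks(answer, leak_targets):
    """Return the annotated leak targets that appear in an agent's answer.

    Case-insensitive substring matching is a stand-in for real judging.
    """
    lowered = answer.lower()
    return [t for t in leak_targets if t.lower() in lowered]
```

In this sketch, a fact written by `alice` with `readers={"dr_chen"}` is visible to `dr_chen` but not to `bob`, and after `forget("alice", ...)` it is visible to no one. A real agent is harder to govern than a filtered list, because deleted or restricted content can survive in summaries, embeddings, or the prompt context, which is the kind of leakage the benchmark is designed to expose.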

📄 Original Abstract (English)

Memory benchmarks for LLM agents largely assume single-user settings, leaving shared assistants for hospitals, workplaces, campuses, and households understudied. In these deployments, multiple principals write to a common memory pool and query it under different roles, scopes, and relationships, so memory quality requires governance as well as recall. We introduce GateMem, a benchmark for multi-principal shared-memory agents. GateMem jointly evaluates utility for legitimate long-horizon requests with state updates, access control across contextual authorization boundaries, and agent-facing active forgetting after explicit deletion requests. It spans medical, office, education, and household domains, with long-form multi-party episodes, incremental memory injection, hidden checkpoints, structured judging, and leak-target annotations. Across diverse baselines and backbone models, no method simultaneously achieves strong utility, robust access control, and reliable forgetting. Long-context prompting often yields the best governance score at high token cost, while retrieval-based and external-memory methods reduce cost yet still leak unauthorized or deleted information. These results show current memory agents remain far from reliable shared institutional deployment.

Original paper on arXiv
