arxiv:2606.00619

MemPro: Agentic Memory Systems as Evolvable Programs

Published on May 30

Abstract

MemPro is a system-level evolution framework that treats the entire memory construction-retrieval pipeline as an evolvable program, enabling iterative improvement through failure-mode-guided refinement and outperforming static baselines in long-horizon autonomous agent tasks.

Long-horizon autonomous agents require memory systems to retain historical information, track evolving states, and reuse relevant knowledge beyond finite context windows. Existing agentic memory systems typically follow a memory construction-retrieval (MCR) pipeline, but often adapt mainly the memory bank while keeping the surrounding pipeline fixed after deployment. This fixed-pipeline design struggles to handle heterogeneous task-specific failure modes and can become misaligned with memory banks that evolve in scale and structure over time. To address these limitations, we propose MemPro, a system-level evolution framework that treats the entire MCR pipeline as an evolvable program rather than adapting only the memory bank or prompt text. MemPro maintains a version tree of runnable memory-system implementations, where an Evolving Agent iteratively selects promising versions, diagnoses recurring failures, and creates improved child versions through failure-mode-guided edit-debug refinement. Experiments on LongMemEval, LoCoMo, HotpotQA, and NarrativeQA show that MemPro consistently outperforms strong static and prompt-level evolving baselines within a few iterations, continues to improve with evolution, and achieves a favorable performance-cost trade-off. Code is available at https://github.com/wanghai673/MemPro.

Community

MemPro proposes a novel system-level evolution framework that treats the entire memory construction–retrieval (MCR) pipeline as an evolvable program. Unlike prior work that only adapts the memory bank or prompt text, MemPro maintains a version tree of runnable pipeline implementations, where an Evolving Agent iteratively:

  • 🔍 Selects the most promising pipeline version based on evaluation logs
  • 🛠️ Expands it via failure-mode-guided edit–debug refinement (both prompts + executable code)
  • 📊 Evaluates the new version on a held-out training set

This addresses two key limitations of fixed-pipeline memory systems: task heterogeneity (different tasks need different memory strategies) and memory–pipeline misalignment (the pipeline becomes misaligned as the memory bank evolves).
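The select, expand, evaluate loop over a version tree can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the names `Version`, `evaluate`, `evolve`, and `toy_propose_child` are hypothetical, selection is a simple greedy pick of the highest-scoring version, scoring is exact-match accuracy, and the `propose_child` callback stands in for the LLM-driven Evolving Agent that would edit prompts and code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Example = Tuple[str, str]          # (query, expected answer)
Pipeline = Callable[[str], str]    # hypothetical: maps a query to an answer


@dataclass
class Version:
    """One node in the version tree: a runnable pipeline plus its score."""
    pipeline: Pipeline
    score: float = 0.0
    parent: Optional["Version"] = None
    children: List["Version"] = field(default_factory=list)


def evaluate(pipeline: Pipeline, examples: List[Example]) -> float:
    """Exact-match accuracy (an assumed metric, chosen for simplicity)."""
    hits = sum(1 for query, expected in examples if pipeline(query) == expected)
    return hits / len(examples)


def evolve(root: Version, propose_child, examples: List[Example],
           iterations: int) -> Version:
    """Repeatedly select a version, expand it into a child, and evaluate it."""
    root.score = evaluate(root.pipeline, examples)
    versions = [root]
    for _ in range(iterations):
        # Select: greedy stand-in for the agent's choice of a promising version.
        parent = max(versions, key=lambda v: v.score)
        # Diagnose: collect the examples the selected version gets wrong.
        failures = [(q, e) for q, e in examples if parent.pipeline(q) != e]
        # Expand: build a child pipeline guided by those failures.
        child_pipeline = propose_child(parent.pipeline, failures)
        child = Version(child_pipeline, evaluate(child_pipeline, examples), parent)
        parent.children.append(child)
        versions.append(child)
    return max(versions, key=lambda v: v.score)


def toy_propose_child(parent_pipeline: Pipeline, failures: List[Example]) -> Pipeline:
    """Toy 'edit': patch the first failing query; a real agent would edit code."""
    if not failures:
        return parent_pipeline
    q_fix, a_fix = failures[0]
    return lambda q: a_fix if q == q_fix else parent_pipeline(q)


examples = [("a", "1"), ("b", "2"), ("c", "3")]
root = Version(pipeline=lambda q: "unknown")
best = evolve(root, toy_propose_child, examples, iterations=3)
```

In this toy run each child fixes one failure of its parent, so the tree degenerates into a chain. With a less greedy selection rule, earlier versions could be revisited and the tree would branch.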

Experiments on LongMemEval, LoCoMo, HotpotQA, and NarrativeQA show that MemPro consistently outperforms strong static and prompt-level evolving baselines (including GEPA and MetaMem) within just 5 iterations and continues to improve as evolution progresses, achieving state-of-the-art results with a favorable performance–cost trade-off.

Code: https://github.com/wanghai673/MemPro
