Useful Memories Become Faulty When Continuously Updated by LLMs • Paper • 2605.12978 • Published 4 days ago • 18
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards • Paper • 2605.10899 • Published 6 days ago • 72
Heterogeneous Scientific Foundation Model Collaboration • Paper • 2604.27351 • Published 17 days ago • 213
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time • Paper • 2505.24863 • Published May 30, 2025 • 97
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation • Paper • 2405.17890 • Published May 28, 2024
iAgent: LLM Agent as a Shield between User and Recommender Systems • Paper • 2502.14662 • Published Feb 20, 2025
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory • Paper • 2511.20857 • Published Nov 25, 2025 • 3
Mem-Gallery: Benchmarking Multimodal Long-Term Conversational Memory for MLLM Agents • Paper • 2601.03515 • Published Jan 7 • 2
Generalizing Test-time Compute-optimal Scaling as an Optimizable Graph • Paper • 2511.00086 • Published Oct 29, 2025 • 42
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning • Paper • 2509.22576 • Published Sep 26, 2025 • 137