Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling • Paper • 2605.13301 • Published 7 days ago • 151
A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory • Paper • 2510.02373 • Published Sep 29, 2025 • 10
CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics • Paper • 2508.18124 • Published Aug 25, 2025 • 49