AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 12 days ago • 185
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 18 days ago • 269
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark Paper • 2605.10921 • Published 20 days ago • 4
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 20 days ago • 46
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper • 2605.05204 • Published 25 days ago • 27
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 28 days ago • 166
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 108