OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification • Paper • 2606.01476 • Published 8 days ago • 8
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning • Paper • 2605.28774 • Published 12 days ago • 89
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players • Paper • 2605.28816 • Published 12 days ago • 420