DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off • Paper • 2604.13902 • Published 14 days ago • 61
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published 21 days ago • 117
WildDet3D: Scaling Promptable 3D Detection in the Wild • Paper • 2604.08626 • Published 20 days ago • 242
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement • Paper • 2604.01591 • Published 27 days ago • 42
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published 27 days ago • 496