DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off • Paper • 2604.13902 • Published 13 days ago • 61
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published 20 days ago • 116
WildDet3D: Scaling Promptable 3D Detection in the Wild • Paper • 2604.08626 • Published 19 days ago • 241
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement • Paper • 2604.01591 • Published 26 days ago • 42
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published 26 days ago • 495