FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning • Paper • 2601.18150 • Published 4 days ago • 4
Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning • Paper • 2601.20829 • Published 2 days ago • 4
Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning • Paper • 2601.19280 • Published 3 days ago • 7
OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution • Paper • 2601.20380 • Published 2 days ago • 7
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning • Paper • 2601.20209 • Published 3 days ago • 18
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery • Paper • 2601.19325 • Published 3 days ago • 67
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation • Paper • 2601.20614 • Published 2 days ago • 109
Benchmarks Saturate When The Model Gets Smarter Than The Judge • Paper • 2601.19532 • Published 3 days ago • 2
HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models • Paper • 2601.15968 • Published 8 days ago • 5
Towards Pixel-Level VLM Perception via Simple Points Prediction • Paper • 2601.19228 • Published 3 days ago • 14
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models • Paper • 2601.19834 • Published 3 days ago • 24
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision • Paper • 2601.19798 • Published 3 days ago • 38
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning • Paper • 2601.18631 • Published 4 days ago • 46
Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks • Paper • 2601.18226 • Published 4 days ago • 5