Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models Paper • 2602.12036 • Published 4 days ago • 89
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30, 2025 • 34
Big Code Models Leaderboard 📈 • Running • 1.49k • Explore and submit code model evaluations on a leaderboard