IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction Paper • 2511.07327 • Published Nov 10, 2025 • 76
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models Paper • 2509.06949 • Published Sep 8, 2025 • 55
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 133
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published Jul 31, 2025 • 114
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17, 2025 • 259
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17, 2025 • 45
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models Paper • 2507.15375 • Published Jul 21, 2025 • 30
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published Jul 23, 2025 • 36
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19, 2025 • 134
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23, 2025 • 29
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1, 2025 • 79
AceReason Collection Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 14 days ago • 20
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy Paper • 2506.13284 • Published Jun 16, 2025 • 26