Article • Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective • about 1 month ago • 57
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs Paper • 2509.25779 • Published Sep 30, 2025 • 19