Openhands Trajectories Collection Dataset of 67,074 OpenHands trajectories collected with Qwen3-Coder-480B-A35B-Instruct and two RFT checkpoints trained on the data • 3 items • Updated 14 days ago • 6
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Text Generation • 32B • Updated about 16 hours ago • 272k • 529
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 282
view article Article Scaling Test-Time Compute to Achieve Gold Medal at IOI 2025 with Open-Weight Models Oct 20, 2025 • 20
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 141
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 180
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9, 2025 • 101
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4, 2025 • 75
view article Article Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B Aug 18, 2025 • 31