Running • 99 • Unlocking On-Policy Distillation for Any Model Family • Visualize on-policy distillation for any model family
Article • Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models • Nov 19, 2025 • 34
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents Paper • 2510.14967 • Published Oct 16, 2025 • 34