Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation Paper • 2605.26844 • Published 15 days ago • 26 • 2
Model Merging Scaling Laws in Large Language Models Paper • 2509.24244 • Published about 1 month ago • 44
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Paper • 2605.09608 • Published May 10 • 52 • 3
An Efficient Graph-Transformer Operator for Learning Physical Dynamics with Manifolds Embedding Paper • 2512.10227 • Published Dec 11, 2025 • 2
InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training Paper • 2510.15859 • Published Oct 17, 2025 • 13