Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Paper • 2605.29707 • Published 12 days ago • 142
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published 19 days ago • 31
Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models Paper • 2605.11887 • Published 28 days ago • 11
WebWorld: A Large-Scale World Model for Web Agent Training Paper • 2602.14721 • Published Feb 16 • 19