Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models Paper • 2604.02340 • Published Apr 11 • 9
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression Paper • 2602.11008 • Published Feb 11 • 18
Article Smol2Operator: Post-Training GUI Agents for Computer Use +3 A-Mahla, merve, sergiopaniego, reach-vb, lewtun • Sep 23, 2025 • 138
Risk-Averse Reinforcement Learning with Itakura-Saito Loss Paper • 2505.16925 • Published May 22, 2025 • 26
ZhMax/llama-2-13b-ebft-sparsegpt-outlier-wiki-block-outlier Text Generation • 13B • Updated Dec 19, 2024 • 2
ZhMax/llama-2-7b-ebft-sparsegpt-outlier-wiki-block-outlier Text Generation • 7B • Updated Dec 15, 2024 • 3
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs Paper • 2408.15300 • Published Aug 27, 2024 • 3