InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published 16 days ago • 306
FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach Paper • 2603.13364 • Published 24 days ago • 9
The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training Paper • 2603.10444 • Published 22 days ago • 10
Mixture of Attention Heads: Selecting Attention Heads Per Token Paper • 2210.05144 • Published Oct 11, 2022 • 3
MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling Paper • 2602.03359 • Published Feb 3 • 10
MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers Paper • 2602.00398 • Published Jan 30 • 6
Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers Paper • 2602.18292 • Published Feb 20 • 11
RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs Paper • 2602.05367 • Published Feb 5 • 7
POP: Prefill-Only Pruning for Efficient Large Model Inference Paper • 2602.03295 • Published Feb 3 • 4
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i} Paper • 2512.02901 • Published Dec 2, 2025 • 6