Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF Text Generation • 27B • Updated 4 days ago • 90.9k • 166
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Text Generation • 28B • Updated 4 days ago • 30.8k • 394
DynaMoE: Dynamic Token-Level Expert Activation with Layer-Wise Adaptive Capacity for Mixture-of-Experts Neural Networks Paper • 2603.01697 • Published 9 days ago • 2