The Ultra-Scale Playbook 🌌 Running • 3.86k • The ultimate guide to training LLMs on large GPU clusters
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving Paper • 2401.09670 • Published Jan 18, 2024 • 2
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 526k • 1.6k
LLaVA-CoT: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Jul 21, 2025 • 131
QuantFactory/Llama-3.2-Taiwan-Legal-3B-Instruct-GGUF Text Generation • 3B • Updated Nov 2, 2024 • 389 • 10
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published Jan 9, 2025 • 104
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 125
Llama3.1 S V0.2 Checkpoint 2024 08 20 😻 Running on Zero • Agents • Featured • 115 • Chat with Llama3.1 using spoken audio or synthesize speech
shenzhi-wang/Llama3.1-8B-Chinese-Chat Text Generation • 8B • Updated Jul 29, 2024 • 8.49k • 265
Microsoft Phi-3-Vision-128k 😻 Runtime error • Agents • Featured • 219 • Chat with an image using the Phi-3 Vision model