Post
161
š vLLM Docker Image for NVIDIA DGX Spark (GB10/SM121)
Just released a pre-built vLLM Docker image optimized for DGX Spark's ARM64 + Blackwell SM121 GPU.
**Why this exists:**
Standard vLLM images don't support SM121 - you get "SM121 not supported" errors. This image includes patches for full GB10 compatibility.
**What's included:**
- vLLM 0.15.0 + SM121 patches
- PyTorch 2.11 + CUDA 13.0
- ARM64 (aarch64) native
- Pre-configured for FlashInfer attention
**Verified models:**
- Qwen3-Next-80B-A3B-FP8 (1M context!)
- Qwen3-Embedding-8B (4096-dim embeddings)
- Qwen3-VL-30B (vision)
docker pull
https://hub.docker.com/r/hellohal2064/vllm-dgx-spark-gb10
Just released a pre-built vLLM Docker image optimized for DGX Spark's ARM64 + Blackwell SM121 GPU.
**Why this exists:**
Standard vLLM images don't support SM121 - you get "SM121 not supported" errors. This image includes patches for full GB10 compatibility.
**What's included:**
- vLLM 0.15.0 + SM121 patches
- PyTorch 2.11 + CUDA 13.0
- ARM64 (aarch64) native
- Pre-configured for FlashInfer attention
**Verified models:**
- Qwen3-Next-80B-A3B-FP8 (1M context!)
- Qwen3-Embedding-8B (4096-dim embeddings)
- Qwen3-VL-30B (vision)
docker pull
https://hub.docker.com/r/hellohal2064/vllm-dgx-spark-gb10