
Check out the documentation for more information.

## Download

Download a specific WebGPU model:

```shell
huggingface-cli download webai-community/ai-models --include "ai-models/<MODEL_NAME>/onnx-webgpu/*" --local-dir .
```

Download all WebGPU models:

```shell
huggingface-cli download webai-community/ai-models --include "*/onnx-webgpu/*" --local-dir .
```
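The `--include` flag filters repository files with shell-style glob patterns (semantics similar to Python's `fnmatch`). A minimal sketch of which paths a pattern such as `*/onnx-webgpu/*` selects — the file names below are hypothetical, for illustration only:

```python
from fnmatch import fnmatch

# Hypothetical repo paths for illustration; the actual repo layout may differ.
paths = [
    "Qwen3-0.6B/onnx-webgpu/model.onnx",
    "Qwen3-0.6B/gguf/Qwen3-0.6B.gguf",
    "gemma-2-2b-it/onnx-webgpu/config.json",
]

# "*/onnx-webgpu/*" keeps only files under an onnx-webgpu/ directory.
webgpu_files = [p for p in paths if fnmatch(p, "*/onnx-webgpu/*")]
print(webgpu_files)
```

The same filtering can be done from Python with `huggingface_hub.snapshot_download(repo_id="webai-community/ai-models", allow_patterns=["*/onnx-webgpu/*"], local_dir=".")`.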

## Model List

| model name | params size | gguf model | ort webgpu model | model info |
|---|---|---|---|---|
| Phi-4-mini-instruct | 3.8B | gguf | onnx-webgpu | README |
| Phi-4-mini-reasoning | 3.8B | gguf | onnx-webgpu | README |
| Phi-3.5-mini-instruct | 3.8B | gguf | onnx-webgpu | README |
| Phi-3-mini-4k-instruct | 3.8B | gguf | onnx-webgpu | README |
| Phi-3-mini-128k-instruct | 3.8B | gguf | onnx-webgpu | README |
| Qwen3-0.6B | 0.6B | gguf | onnx-webgpu | README |
| Qwen3-1.7B | 1.7B | gguf | onnx-webgpu | README |
| Qwen3-4B | 4B | gguf | onnx-webgpu | README |
| Qwen3-8B | 8B | gguf | onnx-webgpu | README |
| Qwen2.5-0.5B-Instruct | 0.5B | gguf | onnx-webgpu | README |
| Qwen2.5-1.5B-Instruct | 1.5B | gguf | onnx-webgpu | README |
| Qwen2.5-3B-Instruct | 3B | gguf | onnx-webgpu | README |
| Qwen2.5-7B-Instruct | 7B | gguf | onnx-webgpu | README |
| Qwen2-0.5B-Instruct | 0.5B | gguf | onnx-webgpu | README |
| Qwen2-1.5B-Instruct | 1.5B | gguf | onnx-webgpu | README |
| Qwen2-7B-Instruct | 7B | gguf | onnx-webgpu | README |
| DeepSeek-R1-Distill-Qwen-1.5B | 1.5B | gguf | onnx-webgpu | README |
| DeepSeek-R1-Distill-Qwen-7B | 7B | gguf | onnx-webgpu | README |
| DeepSeek-R1-Distill-Llama-8B | 8B | gguf | onnx-webgpu | README |
| DeepSeek-R1-0528-Qwen3-8B | 8B | gguf | onnx-webgpu | README |
| gemma-3-1b-it | 1B | gguf | onnx-webgpu | README |
| gemma-2-2b-it | 2B | gguf | onnx-webgpu | README |
| gemma-2-9b-it | 9B | gguf | onnx-webgpu | README |
| gemma-2b-it | 2B | gguf | onnx-webgpu | README |
| gemma-7b-it | 7B | gguf | onnx-webgpu | README |
| internlm2_5-7b-chat | 7B | gguf | onnx-webgpu | README |
| internlm2-chat-1_8b | 1.8B | gguf | onnx-webgpu | README |
| internlm2-chat-7b | 7B | gguf | onnx-webgpu | README |
| Nemotron-Mini-4B-Instruct | 4B | gguf | onnx-webgpu | README |
| Nemotron-Cascade-8B-Thinking | 8B | gguf | onnx-webgpu | README |
| SmolLM2-1.7B-Instruct | 1.7B | gguf | onnx-webgpu | README |
| SmolLM2-360M-Instruct | 360M | gguf | onnx-webgpu | README |
| SmolLM2-135M-Instruct | 135M | gguf | onnx-webgpu | README |
| SmolLM-1.7B-Instruct | 1.7B | gguf | onnx-webgpu | README |
| SmolLM-360M-Instruct | 360M | gguf | onnx-webgpu | README |
| SmolLM-135M-Instruct | 135M | gguf | onnx-webgpu | README |
| Yi-Coder-1.5B-Chat | 1.5B | gguf | onnx-webgpu | README |
| Qwen2.5-Coder-0.5B-Instruct | 0.5B | gguf | onnx-webgpu | README |
| Qwen2.5-Coder-1.5B-Instruct | 1.5B | gguf | onnx-webgpu | README |
| Qwen2.5-Coder-7B-Instruct | 7B | gguf | onnx-webgpu | README |
| TinyLlama-1.1B-Chat-v1.0 | 1.1B | gguf | onnx-webgpu | README |
| CodeLlama-7b-Instruct-hf | 7B | gguf | onnx-webgpu | README |
| SOLAR-10.7B-Instruct-v1.0 | 10.7B | gguf | onnx-webgpu | README |
| gpt-oss-20b | 20B | gguf | onnx-webgpu | README |
| granite-3.1-2b-instruct | 2B | gguf | onnx-webgpu | README |
| granite-3.1-8b-instruct | 8B | gguf | onnx-webgpu | README |
| granite-3.2-2b-instruct | 2B | gguf | onnx-webgpu | README |
| granite-3.2-8b-instruct | 8B | gguf | onnx-webgpu | README |
| granite-3.3-2b-instruct | 2B | gguf | onnx-webgpu | README |
| granite-3.3-8b-instruct | 8B | gguf | onnx-webgpu | README |
| Ministral-8B-Instruct-2410 | 8B | gguf | onnx-webgpu | README |
| Mistral-7B-Instruct-v0.2 | 7B | gguf | onnx-webgpu | README |
| Mistral-7B-Instruct-v0.3 | 7B | gguf | onnx-webgpu | README |
| Mistral-Nemo-Instruct-2407 | 12B | gguf | onnx-webgpu | README |
| Yi-1.5-6B-Chat | 6B | gguf | onnx-webgpu | README |
| Yi-1.5-9B-Chat | 9B | gguf | onnx-webgpu | README |