nm-testing/DeepSeek-V2-Lite-FP8-BLOCK-Fused • 16B • 29
0.6B • 17
nm-testing/tinysmokeqwen3 • 2.64M • 195
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4A16-GPTQ • 5B • 5
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ-ActOrder • 5B • 19
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ • 5B • 23
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4 • 5B • 10
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4A16-GPTQ • 5B • 9
nm-testing/Speculator-Qwen3-30B-MOE-VL-Eagle3 • 0.4B • 164
nm-testing/Qwen3-0.6B-FP8_BLOCK • 0.6B • 66
nm-testing/Qwen3-0.6B-W4A16-G128 • 0.2B • 234
nm-testing/Llama-3.2-1B-Instruct-DEBUG-STRAWBERRY • 1B • 17
nm-testing/Llama-3.2-1B-Instruct-DEBUG-COUNTER • 1B • 46
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme • Text Generation • 0.4B • 1.14k
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-attn_head • 1B • 57
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-tensor • 1B • 1.35k
nm-testing/Qwen3-30B-A3B-MXFP4A16 • 17B • 5.12k
nm-testing/Qwen3-32B-MXFP4A16
nm-testing/Meta-Llama-3-8B-Instruct-awq-NVFP4
nm-testing/testing-llama3.1.8b-2layer-eagle3
nm-testing/Qwen3-30B-A3B-NVFP416 • 17B • 38
nm-testing/CDH-test-nvfp4-awq • 5B
nm-testing/granite-4.0-h-small-FP8-dynamic • Text Generation • 32B • 4
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable • 2.54M • 2.34k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head