Straightforward dynamic FP8 quant using llmcompressor. Nice for performance on Hopper and Blackwell GPUs.
Tested with nightly vllm on November 25-26, 2025.
- Downloads last month
- 48
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
Model tree for ramendik/Vistral-24B-Instruct-FP8
Base model
mistralai/Mistral-Small-3.1-24B-Base-2503
Finetuned
Vikhrmodels/Vistral-24B-Instruct