INC4AI committed (verified)
Commit 5339741 · 1 Parent(s): 7d3ff0b

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -6,8 +6,8 @@ base_model:
 
 ## Model Details
 
-This model is an mxfp4 quantized version of [Qwen/Qwen3-235B-A22B](https://huggingface.co/Qwen/Qwen3-235B-A22B) generated by [intel/auto-round](https://github.com/intel/auto-round).
-The model is not able to be published due to the storage limitation. Please follow the INC example README to generate and evaluate the low precision model.
+This model card covers the mxfp4/mxfp8 quantizations of [Qwen/Qwen3-235B-A22B](https://huggingface.co/Qwen/Qwen3-235B-A22B) generated with [intel/auto-round](https://github.com/intel/auto-round).
+The quantized models cannot be published here due to storage limitations. Please follow the INC example README to generate and evaluate the low-precision models.
 
 ## How to Use
 
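For context, a minimal sketch of what "generate the low-precision models" could look like with auto-round's Python API. The `scheme` value, the output directory, and the loading setup below are illustrative assumptions, not the INC recipe; the INC example README is the authoritative source for the exact options.

```python
# Minimal sketch, NOT the INC recipe: assumes a recent auto-round release
# whose AutoRound constructor accepts a microscaling scheme name. Verify
# option names against the INC example README before running.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/Qwen3-235B-A22B"

# Loading a 235B MoE checkpoint needs hundreds of GB of memory/disk,
# which is why the quantized artifacts are not published on the Hub.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# scheme="MXFP4" (or "MXFP8") selects the microscaling format -- the
# parameter name is an assumption about the installed auto-round version.
autoround = AutoRound(model, tokenizer, scheme="MXFP4")
autoround.quantize()
autoround.save_quantized("./Qwen3-235B-A22B-MXFP4")  # hypothetical output dir
```

Here mxfp4/mxfp8 refer to the OCP microscaling formats: 4- or 8-bit floating-point elements that share one power-of-two exponent scale per 32-element block.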
@@ -15,13 +15,13 @@ The step-by-step README of quantization and evaluation can be found in [Intel Ne
 
 ## Evaluate Results
 
-| Task | backend | BF16 | MXFP4 |
-|:---------:|:-------:|:------:|:------:|
-| hellaswag | vllm | 0.6794 | 0.6680 |
-| piqa | vllm | 0.8177 | 0.8161 |
-| mmlu | vllm | 0.8492 | 0.8435 |
-| gsm8k | vllm | 0.9242 | 0.9363 |
-| average | vllm | 0.8176 | 0.8160 |
+| Task | backend | BF16 | MXFP4 | MXFP8 |
+|:-----------:|:-------:|:----------:|:----------:|:----------:|
+| hellaswag | vllm | 0.6794 | 0.6680 | 0.6768 |
+| piqa | vllm | 0.8177 | 0.8161 | 0.8221 |
+| mmlu | vllm | 0.8492 | 0.8435 | 0.8472 |
+| gsm8k | vllm | 0.9242 | 0.9363 | 0.9325 |
+| **average** | vllm | **0.8176** | **0.8160** | **0.8196** |
 
 ## Ethical Considerations and Limitations
 
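The table names the backend (vllm) but not the command that produced the scores. A hedged sketch, assuming the tasks were run through lm-evaluation-harness (whose task registry matches the names above) with its vLLM backend; the checkpoint path and `tensor_parallel_size` are placeholders:

```python
# Sketch only: assumes lm-evaluation-harness (`pip install lm-eval`) with
# vLLM installed, and a locally generated quantized checkpoint.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",  # vLLM backend, matching the "backend" column above
    model_args="pretrained=./Qwen3-235B-A22B-MXFP4,tensor_parallel_size=8",
    tasks=["hellaswag", "piqa", "mmlu", "gsm8k"],
)
# results["results"] maps each task name to its metric dict
for task, metrics in results["results"].items():
    print(task, metrics)
```

The **average** row is the unweighted arithmetic mean of the four task scores (e.g. (0.6794 + 0.8177 + 0.8492 + 0.9242) / 4 ≈ 0.8176 for BF16); the harness does not compute it for you.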