Upload folder using huggingface_hub
This view is limited to 50 files because it contains too many changes.
- .gitattributes +2 -0
- README.md +141 -0
- chat_template.jinja +86 -0
- config.json +78 -0
- generation_config.json +10 -0
- hf_quant_config.json +14 -0
- model-00001-of-00041.safetensors +3 -0
- model-00002-of-00041.safetensors +3 -0
- model-00003-of-00041.safetensors +3 -0
- model-00004-of-00041.safetensors +3 -0
- model-00005-of-00041.safetensors +3 -0
- model-00006-of-00041.safetensors +3 -0
- model-00007-of-00041.safetensors +3 -0
- model-00008-of-00041.safetensors +3 -0
- model-00009-of-00041.safetensors +3 -0
- model-00010-of-00041.safetensors +3 -0
- model-00011-of-00041.safetensors +3 -0
- model-00012-of-00041.safetensors +3 -0
- model-00013-of-00041.safetensors +3 -0
- model-00014-of-00041.safetensors +3 -0
- model-00015-of-00041.safetensors +3 -0
- model-00016-of-00041.safetensors +3 -0
- model-00017-of-00041.safetensors +3 -0
- model-00018-of-00041.safetensors +3 -0
- model-00019-of-00041.safetensors +3 -0
- model-00020-of-00041.safetensors +3 -0
- model-00021-of-00041.safetensors +3 -0
- model-00022-of-00041.safetensors +3 -0
- model-00023-of-00041.safetensors +3 -0
- model-00024-of-00041.safetensors +3 -0
- model-00025-of-00041.safetensors +3 -0
- model-00026-of-00041.safetensors +3 -0
- model-00027-of-00041.safetensors +3 -0
- model-00028-of-00041.safetensors +3 -0
- model-00029-of-00041.safetensors +3 -0
- model-00030-of-00041.safetensors +3 -0
- model-00031-of-00041.safetensors +3 -0
- model-00032-of-00041.safetensors +3 -0
- model-00033-of-00041.safetensors +3 -0
- model-00034-of-00041.safetensors +3 -0
- model-00035-of-00041.safetensors +3 -0
- model-00036-of-00041.safetensors +3 -0
- model-00037-of-00041.safetensors +3 -0
- model-00038-of-00041.safetensors +3 -0
- model-00039-of-00041.safetensors +3 -0
- model-00040-of-00041.safetensors +3 -0
- model-00041-of-00041.safetensors +3 -0
- model.safetensors.index.json +3 -0
- special_tokens_map.json +34 -0
- tokenizer.json +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+model.safetensors.index.json filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,141 @@
+---
+datasets:
+- abisee/cnn_dailymail
+- nvidia/Nemotron-Post-Training-Dataset-v2
+base_model:
+- zai-org/GLM-4.7
+base_model_relation: quantized
+license: mit
+pipeline_tag: text-generation
+---
+# GLM-4.7-NVFP4
+
+**Format:** NVFP4 (optimal partial quantization of weights and activations to NVFP4).
+**Base model:** `zai-org/GLM-4.7`
+**How it was made:** [AutoQuantized](https://nvidia.github.io/Model-Optimizer/guides/_pytorch_quantization.html#optimal-partial-quantization-using-auto-quantize) to NVFP4 with [NVIDIA Model-Optimizer](https://github.com/NVIDIA/Model-Optimizer/) on 8x RTX PRO 6000 GPUs, using the default calibration mix ([cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) and [nemotron-post-training-dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2)).
+
+See the [original model card](https://huggingface.co/zai-org/GLM-4.7) for information about the base model.
+
+---
+
+### **MMLU Benchmark Results: Salyut1/GLM-4.7-NVFP4**
+#### **Summary Table**
+| Groups | Version | Metric | Value | Stderr |
+| --- | --- | --- | --- | --- |
+| **MMLU (Total)** | 2 | acc ↑ | **0.8348** | ± 0.0030 |
+| **Social Sciences** | 2 | acc ↑ | **0.9051** | ± 0.0052 |
+| **Other** | 2 | acc ↑ | **0.8684** | ± 0.0058 |
+| **STEM** | 2 | acc ↑ | **0.8351** | ± 0.0064 |
+| **Humanities** | 2 | acc ↑ | **0.7664** | ± 0.0059 |
+#### **STEM**
+| Tasks | n-shot | Metric | Value | Stderr |
+| --- | --- | --- | --- | --- |
+| High School Biology | 0 | acc ↑ | 0.9516 | ± 0.0122 |
+| College Biology | 0 | acc ↑ | 0.9514 | ± 0.0180 |
+| Astronomy | 0 | acc ↑ | 0.9474 | ± 0.0182 |
+| High School Computer Science | 0 | acc ↑ | 0.9300 | ± 0.0256 |
+| Conceptual Physics | 0 | acc ↑ | 0.9064 | ± 0.0190 |
+| Elementary Mathematics | 0 | acc ↑ | 0.8862 | ± 0.0164 |
+| Electrical Engineering | 0 | acc ↑ | 0.8690 | ± 0.0281 |
+| High School Statistics | 0 | acc ↑ | 0.8565 | ± 0.0239 |
+| College Computer Science | 0 | acc ↑ | 0.8400 | ± 0.0368 |
+| Anatomy | 0 | acc ↑ | 0.8296 | ± 0.0325 |
+| High School Physics | 0 | acc ↑ | 0.7947 | ± 0.0330 |
+| High School Chemistry | 0 | acc ↑ | 0.7882 | ± 0.0287 |
+| Machine Learning | 0 | acc ↑ | 0.7679 | ± 0.0401 |
+| College Physics | 0 | acc ↑ | 0.7647 | ± 0.0422 |
+| Abstract Algebra | 0 | acc ↑ | 0.6800 | ± 0.0469 |
+| College Chemistry | 0 | acc ↑ | 0.6800 | ± 0.0469 |
+| College Mathematics | 0 | acc ↑ | 0.6800 | ± 0.0469 |
+| High School Mathematics | 0 | acc ↑ | 0.6481 | ± 0.0291 |
+#### **Social Sciences**
+| Tasks | n-shot | Metric | Value | Stderr |
+| --- | --- | --- | --- | --- |
+| High School Government/Politics | 0 | acc ↑ | 0.9793 | ± 0.0103 |
+| High School Microeconomics | 0 | acc ↑ | 0.9706 | ± 0.0110 |
+| High School Psychology | 0 | acc ↑ | 0.9523 | ± 0.0091 |
+| Human Sexuality | 0 | acc ↑ | 0.9313 | ± 0.0222 |
+| Sociology | 0 | acc ↑ | 0.9204 | ± 0.0191 |
+| High School Geography | 0 | acc ↑ | 0.9192 | ± 0.0194 |
+| High School Macroeconomics | 0 | acc ↑ | 0.9000 | ± 0.0152 |
+| US Foreign Policy | 0 | acc ↑ | 0.9000 | ± 0.0302 |
+| Professional Psychology | 0 | acc ↑ | 0.8725 | ± 0.0135 |
+| Security Studies | 0 | acc ↑ | 0.8653 | ± 0.0219 |
+| Public Relations | 0 | acc ↑ | 0.7636 | ± 0.0407 |
+| Econometrics | 0 | acc ↑ | 0.7544 | ± 0.0405 |
+#### **Humanities**
+| Tasks | n-shot | Metric | Value | Stderr |
+| --- | --- | --- | --- | --- |
+| High School US History | 0 | acc ↑ | 0.9461 | ± 0.0159 |
+| High School World History | 0 | acc ↑ | 0.9367 | ± 0.0158 |
+| World Religions | 0 | acc ↑ | 0.9064 | ± 0.0223 |
+| Prehistory | 0 | acc ↑ | 0.8981 | ± 0.0168 |
+| International Law | 0 | acc ↑ | 0.8926 | ± 0.0283 |
+| Jurisprudence | 0 | acc ↑ | 0.8889 | ± 0.0304 |
+| Logical Fallacies | 0 | acc ↑ | 0.8834 | ± 0.0252 |
+| High School European History | 0 | acc ↑ | 0.8788 | ± 0.0255 |
+| Moral Disputes | 0 | acc ↑ | 0.8699 | ± 0.0181 |
+| Philosophy | 0 | acc ↑ | 0.8617 | ± 0.0196 |
+| Formal Logic | 0 | acc ↑ | 0.7460 | ± 0.0389 |
+| Professional Law | 0 | acc ↑ | 0.6610 | ± 0.0121 |
+| Moral Scenarios | 0 | acc ↑ | 0.6425 | ± 0.0160 |
+#### **Other**
+| Tasks | n-shot | Metric | Value | Stderr |
+| --- | --- | --- | --- | --- |
+| Medical Genetics | 0 | acc ↑ | 0.9800 | ± 0.0141 |
+| Marketing | 0 | acc ↑ | 0.9530 | ± 0.0139 |
+| Miscellaneous | 0 | acc ↑ | 0.9374 | ± 0.0087 |
+| Professional Medicine | 0 | acc ↑ | 0.9301 | ± 0.0155 |
+| Clinical Knowledge | 0 | acc ↑ | 0.9057 | ± 0.0180 |
+| Nutrition | 0 | acc ↑ | 0.9052 | ± 0.0168 |
+| Management | 0 | acc ↑ | 0.8932 | ± 0.0306 |
+| Business Ethics | 0 | acc ↑ | 0.8600 | ± 0.0349 |
+| Computer Security | 0 | acc ↑ | 0.8600 | ± 0.0349 |
+| Human Aging | 0 | acc ↑ | 0.8161 | ± 0.0260 |
+| College Medicine | 0 | acc ↑ | 0.7977 | ± 0.0306 |
+| Professional Accounting | 0 | acc ↑ | 0.7624 | ± 0.0254 |
+| Global Facts | 0 | acc ↑ | 0.6500 | ± 0.0479 |
+| Virology | 0 | acc ↑ | 0.5723 | ± 0.0385 |
+
+---
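Review note: the Stderr values in the tables above look consistent with the sample standard error of a mean of 0/1 outcomes, sqrt(p(1-p)/(n-1)). The sketch below checks one row under that assumption. The formula and the question count of 100 for Abstract Algebra are assumptions, since the README states neither, and `binomial_stderr` is a hypothetical helper name.

```python
import math


def binomial_stderr(acc: float, n: int) -> float:
    """Sample standard error of an accuracy measured on n 0/1 outcomes."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# Assumption: 100 test questions for Abstract Algebra (not stated in the README).
stderr = binomial_stderr(0.68, 100)
print(f"{stderr:.4f}")  # 0.0469, the value shown in the Abstract Algebra row

# Rough 95% interval under a normal approximation.
low, high = 0.68 - 1.96 * stderr, 0.68 + 1.96 * stderr
print(f"{low:.3f} to {high:.3f}")
```

Read this way, subtasks with few questions carry wide intervals, so small per-task differences from the base model may not be meaningful on their own.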
+
+vLLM Inference Note:
+
+I needed to patch `vllm/model_executor/models/glm4_moe.py` so that it skips `k_scale` and `v_scale` parameters that are missing from the checkpoint instead of crashing. The script below fixed my `k_scale` and `v_scale` errors.
+```python
+import sys
+import os
+import re
+
+# Path to the vLLM model file
+path = '/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/glm4_moe.py'
+
+if os.path.exists(path):
+    with open(path, 'r') as f:
+        lines = f.readlines()
+
+    target_str = 'param = params_dict[name]'
+    new_lines = []
+    patched = False
+
+    for line in lines:
+        # We look for the parameter loading line
+        if target_str in line and 'k_scale' not in line:
+            whitespace = re.match(r'^(\s*)', line).group(1)
+
+            # Inject logic: If asking for k_scale/v_scale and it's missing, skip
+            payload = f"{whitespace}if ('k_scale' in name or 'v_scale' in name) and name not in params_dict: continue\n"
+
+            new_lines.append(payload)
+            new_lines.append(line)
+            patched = True
+        else:
+            new_lines.append(line)
+
+    if patched:
+        with open(path, 'w') as f:
+            f.writelines(new_lines)
+        print(f"Successfully patched {path}")
+    else:
+        print("File already patched or target not found.")
+```
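Review note: as written, the README script would insert the guard again on a second run, because the target line is still present after patching. The sketch below shows one way to make the same edit idempotent by working on a list of lines. `inject_kv_scale_guard` is a hypothetical helper, not part of vLLM or of this commit, and the two-line `source` is a stand-in for the real file.

```python
import re

TARGET = "param = params_dict[name]"
GUARD = "if ('k_scale' in name or 'v_scale' in name) and name not in params_dict: continue"


def inject_kv_scale_guard(lines: list[str]) -> tuple[list[str], bool]:
    """Return (new_lines, changed), inserting the guard before each target line once."""
    out: list[str] = []
    changed = False
    for line in lines:
        if TARGET in line:
            indent = re.match(r"\s*", line).group(0)
            guard_line = f"{indent}{GUARD}\n"
            # Only insert when the previously emitted line is not already the guard.
            if not out or out[-1] != guard_line:
                out.append(guard_line)
                changed = True
        out.append(line)
    return out, changed


# Stand-in for the loader loop in the real file (an assumption for illustration).
source = ["for name, w in weights:\n", "    param = params_dict[name]\n"]
once, changed_first = inject_kv_scale_guard(source)
twice, changed_second = inject_kv_scale_guard(once)
print(changed_first, changed_second)
```

Keeping the transformation separate from file I/O also makes it easy to try on a copy of the file before touching the installed package.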
chat_template.jinja
ADDED
@@ -0,0 +1,86 @@
+[gMASK]<sop>
+{%- if tools -%}
+<|system|>
+# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{% for tool in tools %}
+{{ tool | tojson(ensure_ascii=False) }}
+{% endfor %}
+</tools>
+
+For each function call, output the function name and arguments within the following XML format:
+<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}
+{%- macro visible_text(content) -%}
+{%- if content is string -%}
+{{- content }}
+{%- elif content is iterable and content is not mapping -%}
+{%- for item in content -%}
+{%- if item is mapping and item.type == 'text' -%}
+{{- item.text }}
+{%- elif item is string -%}
+{{- item }}
+{%- endif -%}
+{%- endfor -%}
+{%- else -%}
+{{- content }}
+{%- endif -%}
+{%- endmacro -%}
+{%- set ns = namespace(last_user_index=-1) %}
+{%- for m in messages %}
+{%- if m.role == 'user' %}
+{% set ns.last_user_index = loop.index0 -%}
+{%- endif %}
+{%- endfor %}
+{% for m in messages %}
+{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}
+{%- elif m.role == 'assistant' -%}
+<|assistant|>
+{%- set reasoning_content = '' %}
+{%- set content = visible_text(m.content) %}
+{%- if m.reasoning_content is string %}
+{%- set reasoning_content = m.reasoning_content %}
+{%- else %}
+{%- if '</think>' in content %}
+{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+{%- set content = content.split('</think>')[-1].lstrip('\n') %}
+{%- endif %}
+{%- endif %}
+{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content -%}
+{{ '<think>' + reasoning_content.strip() + '</think>'}}
+{%- else -%}
+{{ '</think>' }}
+{%- endif -%}
+{%- if content.strip() -%}
+{{ content.strip() }}
+{%- endif -%}
+{% if m.tool_calls %}
+{% for tc in m.tool_calls %}
+{%- if tc.function %}
+{%- set tc = tc.function %}
+{%- endif %}
+{{- '<tool_call>' + tc.name -}}
+{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}
+{% endif %}
+{%- elif m.role == 'tool' -%}
+{%- if m.content is string -%}
+{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+{{- '<|observation|>' }}
+{%- endif %}
+{{- '<tool_response>' }}
+{{- m.content }}
+{{- '</tool_response>' }}
+{%- else -%}
+<|observation|>{% for tr in m.content %}
+<tool_response>{{ tr.output if tr.output is defined else tr }}</tool_response>{% endfor -%}
+{% endif -%}
+{%- elif m.role == 'system' -%}
+<|system|>{{ visible_text(m.content) }}
+{%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+<|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}
+{%- endif -%}
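Review note: when an assistant message has no string `reasoning_content`, the template above recovers the reasoning from inline `<think>` tags with plain string splits. The sketch below mirrors those two expressions in Python to make the behavior easy to check. `split_reasoning` is a hypothetical name used only for illustration, and the sample message is invented.

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's string handling for inline <think> blocks."""
    if "</think>" not in content:
        # The template leaves content untouched and reasoning empty in this case.
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


reasoning, answer = split_reasoning("<think>\nCheck units first.\n</think>\nThe result is 42.")
print(repr(reasoning))  # 'Check units first.'
print(repr(answer))     # 'The result is 42.'
```

Note that the template only re-emits the recovered reasoning for turns after the last user message, unless `clear_thinking` is defined and false. Earlier turns get a bare `</think>`.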
config.json
ADDED
@@ -0,0 +1,78 @@
+{
+  "architectures": [
+    "Glm4MoeForCausalLM"
+  ],
+  "attention_bias": true,
+  "attention_dropout": 0.0,
+  "dtype": "bfloat16",
+  "eos_token_id": [
+    151329,
+    151336,
+    151338
+  ],
+  "first_k_dense_replace": 3,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 5120,
+  "initializer_range": 0.02,
+  "intermediate_size": 12288,
+  "max_position_embeddings": 202752,
+  "model_type": "glm4_moe",
+  "moe_intermediate_size": 1536,
+  "n_group": 1,
+  "n_routed_experts": 160,
+  "n_shared_experts": 1,
+  "norm_topk_prob": true,
+  "num_attention_heads": 96,
+  "num_experts_per_tok": 8,
+  "num_hidden_layers": 92,
+  "num_key_value_heads": 8,
+  "num_nextn_predict_layers": 1,
+  "pad_token_id": 151329,
+  "partial_rotary_factor": 0.5,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "routed_scaling_factor": 2.5,
+  "tie_word_embeddings": false,
+  "topk_group": 1,
+  "transformers_version": "4.57.3",
+  "use_cache": true,
+  "use_qk_norm": true,
+  "vocab_size": 151552,
+  "quantization_config": {
+    "config_groups": {
+      "group_0": {
+        "input_activations": {
+          "dynamic": false,
+          "num_bits": 4,
+          "type": "float",
+          "group_size": 16
+        },
+        "weights": {
+          "dynamic": false,
+          "num_bits": 4,
+          "type": "float",
+          "group_size": 16
+        },
+        "targets": [
+          "Linear"
+        ]
+      }
+    },
+    "ignore": [
+      "lm_head"
+    ],
+    "quant_algo": "NVFP4",
+    "kv_cache_scheme": {
+      "dynamic": false,
+      "num_bits": 8,
+      "type": "float"
+    },
+    "producer": {
+      "name": "modelopt",
+      "version": "0.40.0"
+    },
+    "quant_method": "modelopt"
+  }
+}
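Review note: the attention fields in this config are enough for a back-of-the-envelope KV-cache estimate. The sketch below is a rough calculation, not a statement of what any inference engine allocates. It assumes one K and one V vector per KV head per layer, ignores scale tensors, the extra `num_nextn_predict_layers` layer, and paging overhead, and uses a hypothetical helper named `kv_cache_bytes_per_token`.

```python
# Values copied from config.json above.
config = {
    "num_hidden_layers": 92,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "partial_rotary_factor": 0.5,
}


def kv_cache_bytes_per_token(cfg: dict, bytes_per_value: int) -> int:
    """Bytes of K and V stored per token across all layers (rough estimate)."""
    per_layer = 2 * cfg["num_key_value_heads"] * cfg["head_dim"]  # K and V
    return per_layer * cfg["num_hidden_layers"] * bytes_per_value


fp8_bytes = kv_cache_bytes_per_token(config, 1)   # 8-bit cache, per kv_cache_scheme
bf16_bytes = kv_cache_bytes_per_token(config, 2)  # 16-bit cache for comparison
rotary_dim = int(config["head_dim"] * config["partial_rotary_factor"])

print(fp8_bytes, bf16_bytes, rotary_dim)  # 188416 376832 64
```

Under these assumptions an 8-bit cache needs about 184 KiB per token, half of what a 16-bit cache would need.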
generation_config.json
ADDED
@@ -0,0 +1,10 @@
+{
+  "_from_model_config": true,
+  "eos_token_id": [
+    151329,
+    151336,
+    151338
+  ],
+  "pad_token_id": 151329,
+  "transformers_version": "4.57.3"
+}
hf_quant_config.json
ADDED
@@ -0,0 +1,14 @@
+{
+  "producer": {
+    "name": "modelopt",
+    "version": "0.40.0"
+  },
+  "quantization": {
+    "quant_algo": "NVFP4",
+    "kv_cache_quant_algo": "FP8",
+    "group_size": 16,
+    "exclude_modules": [
+      "lm_head"
+    ]
+  }
+}
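Review note: `group_size: 16` means the 4-bit values share a scale per group of 16 elements, so the storage cost per weight is slightly above 4 bits. The sketch below works through the arithmetic, assuming one 8-bit scale per group and ignoring any per-tensor scale. The scale width is an assumption about the NVFP4 layout, not something this config file states, and `effective_bits_per_weight` is a hypothetical helper.

```python
def effective_bits_per_weight(value_bits: int, scale_bits: int, group_size: int) -> float:
    """Average storage bits per weight when each group shares one scale."""
    return value_bits + scale_bits / group_size


# Assumption: 8-bit scale per 16-element group.
bits = effective_bits_per_weight(4, 8, 16)
print(bits)  # 4.5

# Example: one 5120 x 1536 linear layer (hidden_size x moe_intermediate_size).
num_weights = 5120 * 1536
approx_bytes = int(num_weights * bits / 8)
print(approx_bytes)  # 4423680
```

The same layer in bfloat16 would take 2 bytes per weight, so the quantized form is roughly 28% of that size under these assumptions.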
model-00001-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ecdd1683256b7b40ff3fc6706a8d64679c817954a3f28c6376dd44a8bf2bde02
+size 4998646904
model-00002-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7332ecbba03b1d3e2ed5867cb3feec8512e001801cfffa0ef18c2c2ac31f39e4
+size 4996766232
model-00003-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1378802395eb36a726535689b1cd969a3c32d678312444330b20f05b9c9aceb7
+size 4996766592
model-00004-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a6ad22ee2096c223dea8f29962a5f3c19955875d413f1b49b94a80cdaf38b44
+size 4999926688
model-00005-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0681ab255b7fc4d7af2a8ceea54a07efa2e222ebdcde964755427f644cf5f987
+size 4996770616
model-00006-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4d1ecb6cda3dbb313048ad88809ddf0e293aead5e4989e2d0db55d870ed347e
+size 4996770664
model-00007-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d7c8710d3d2cc45b99cf9c42543cb3801f8a57b85936a78e0da10c12645c182
+size 4996771072
model-00008-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cca67b7ac1b34fc58f01da2fa9b9c09011ec88a95e616e49c51e29999dac6df5
+size 4999929272
model-00009-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aa844a2f98c79aff09b261e9172ef92f371af0279a38fa7be27b6bf40b0627eb
+size 4996770656
model-00010-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f5c230500588c7909a88aa9228a0a8105fb259b221d8579e54b70808c3d61cb
+size 4996770664
model-00011-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f55c35142f91600ab2542c8ae431214f88bb3046e3e27f2d74bab287065e974
+size 4996771120
model-00012-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:360574e4ea8b00ac0fa8f777613f40836eaa58ffed210f8ac10718d3dc132fdb
+size 4999929320
model-00013-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a419a12dade67dc55170911f3467979b65d28263cad5ba7232c75ff12b77046e
+size 4996770656
model-00014-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b1b4f7c7a533dcafbaf08015c34d3e91d3d878b4b64ae4e3fd2cfb257b01aa46
+size 4996770664
model-00015-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6aba2420e1f643385e045972f351563b54091b3c80c07bf8af17a634f30b5d4
+size 4996771168
model-00016-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b54246ec30f920d4d26707bb1a3cd06a5d5a0d3ed44c4d0a6ad54aa008ccafe4
+size 4999929368
model-00017-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fbf0decdc2d36fb57929248c2a8dbfab6c76e972f5aea41f2e2ebff7949d1f07
+size 4996770656
model-00018-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fa552fa21947d91ae2988f25993251dfe3843d4b0b81e1e73707b3cd0f0023e0
+size 4996770712
model-00019-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3ccc47a26daf8413c09ae781029d872fe796ea38fee0eb9f464d97e5dd81517f
+size 4996771168
model-00020-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81a5dd01c6f29e751162f11286b3202790e2a0167930b21e74d38e23ae0c0e5c
+size 4999929392
model-00021-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:694a1a2dba2e2e4157df4f769f194c20f382e07956456645c0a6db8c9e83afd1
+size 4996770656
model-00022-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5e6184d83be6bac104c735aa6d3f5be2b1ba331dfecafc0d87d5e723d1e278f
+size 4996770760
model-00023-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef2cf07d0f47987b3a52cb6a990bc66317244ae3b5c53b70a5d37fb0ed5bd48e
+size 4996771168
model-00024-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:90bc0197e86b4ca3b711ff399cd33442f49d454e5f15c27f1609e150ea68f9cf
+size 4999929344
model-00025-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed1d4820cbd00513cf03d447300ffa541ea3cad8803b2e0d0a74d7b632b6f0e0
+size 4996770656
model-00026-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15067f644a73f9f35d5c6caad44c607b4595dffc0bf75fc62a3330450254440b
+size 4996770808
model-00027-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:212e9ee30b2d5056b6d01232c4bf03b3a4df058d7dd0c7e54c79398090650e32
+size 4996771168
model-00028-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aee3ea7ab93c63dd5be663478d0a068e344350c2cadc81548ae267ee7d86c172
+size 4999929296
model-00029-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:32ceb322b8e6c17c533efa32fc17af4281f672366ed70df0a2c5c0770a0855c9
+size 4996770656
model-00030-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3cdf400d6d4d6e4067c6f82fb373a19335142e1c7db4d450a8cfbdcdb821d70e
+size 4996770856
model-00031-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb64026cca4492ef0dda5bb71e61d26713a4f16cc0dfecd41940d77cbb16da20
+size 4996771168
model-00032-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c800bbb840f146c39963c14d43331c4a995e22ef71c7eb6a33b1a574a19e3eff
+size 4999929288
model-00033-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00e1c555bf6d05fc4e471cd9b88e38deb15bf92f1341caf2c94fc7691b0df499
+size 4996770656
model-00034-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44fb917f390670df7db92c945f77564e814e2f7fc6d485065feb205cc0e26e19
+size 4996770904
model-00035-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aab7c2ac0ff4079843d4e907ea05f132957454ccb696dbd6e9f232d00739a568
+size 4971886168
model-00036-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b218302bd7a6ece6ea466a8dce72c4a5db1ace94a4a7e389d88e584ab9b38db7
+size 4998268584
model-00037-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:737aaee5fa92e6d87ce44bf9502fdeeefe721cd2e01a7becec33b35fa9148f0f
+size 4996770656
model-00038-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e380e90aa7e65118491fdbb7c005c9a27ca6bd02d48a52ae3d58f961ea02647e
+size 4996770928
model-00039-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4cbc3b41ebbc6325b22581f09845068852b77f7834349c511cf50a259672c236
+size 4986659240
model-00040-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:76edabce20e8d04d46e8dd3cb0e835b8e59540908624d76d9d6d2570645cb398
+size 4389170168
model-00041-of-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5344ed0a545f40db7029547e1e473c2a306c1cc2084c5f4c97808f7705a74aa
+size 1551892608
model.safetensors.index.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c72a8cbe89ba70d649e76ed386882145d24f70cc04e4a8da5a3b5adeb128843
+size 16609598
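Review note: the shard and index entries above are Git LFS pointer files, each carrying a spec version, a SHA-256 object id, and a byte size. The sketch below parses one of them, using the pointer for shard 41 copied from this diff. `parse_lfs_pointer` is a hypothetical helper and assumes the simple three-line `key value` layout seen here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents for model-00041-of-00041.safetensors, as shown in this diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:c5344ed0a545f40db7029547e1e473c2a306c1cc2084c5f4c97808f7705a74aa
size 1551892608
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])  # sha256 1551892608
```

Summing the parsed `size` fields across all shards gives the download size, and the digest can be compared with a locally computed SHA-256 to verify a shard.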
special_tokens_map.json
ADDED
@@ -0,0 +1,34 @@
+{
+  "additional_special_tokens": [
+    "<|endoftext|>",
+    "[MASK]",
+    "[gMASK]",
+    "[sMASK]",
+    "<sop>",
+    "<eop>",
+    "<|system|>",
+    "<|user|>",
+    "<|assistant|>",
+    "<|observation|>",
+    "<|begin_of_image|>",
+    "<|end_of_image|>",
+    "<|begin_of_video|>",
+    "<|end_of_video|>",
+    "<|begin_of_audio|>",
+    "<|end_of_audio|>",
+    "<|begin_of_transcription|>",
+    "<|end_of_transcription|>",
+    "<|code_prefix|>",
+    "<|code_middle|>",
+    "<|code_suffix|>",
+    "/nothink"
+  ],
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "<|endoftext|>"
+}
tokenizer.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bda8e2146c3bb7b7e0fc96dcc4f0aeff041c6c27952e3ace0665663ebff346ba
+size 19970700