This is a 4-bit AWQ-quantized (W4A16) version of allenai/OLMo-2-0325-32B-Instruct, produced with LLM Compressor.
| Metric | Value |
|---|---|
| Original (BF16) | ~64.0 GB |
| Quantized (W4A16) | ~16.91 GB |
| Reduction | ~73.6% |
| Memory Saved | ~47.1 GB |
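
The reduction and savings figures follow directly from the two sizes in the table. The snippet below is a small arithmetic check, not part of the quantization pipeline. The "effective bits per weight" line is a rough estimate that assumes the BF16 checkpoint stores 16 bits per parameter; the result sits slightly above 4 because it averages in quantization scales and any layers left unquantized.

```python
# Sizes in GB, copied from the table above.
original_gb = 64.0
quantized_gb = 16.91

# Absolute and relative savings.
saved_gb = original_gb - quantized_gb
reduction_pct = 100.0 * saved_gb / original_gb

# Rough average storage cost per weight, assuming 16 bits per weight in BF16.
bits_per_weight = 16 * quantized_gb / original_gb

print(f"Memory saved: ~{saved_gb:.1f} GB")
print(f"Reduction: ~{reduction_pct:.1f}%")
print(f"Effective bits per weight: ~{bits_per_weight:.2f}")
```
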
OLMo 2 is a series of fully open language models from the Allen Institute for AI.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "ronantakizawa/olmo2-32b-instruct-awq-w4a16",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "ronantakizawa/olmo2-32b-instruct-awq-w4a16",
    trust_remote_code=True
)

# Build the prompt with the chat template
messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate and decode (the decoded output includes the prompt)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
The model uses this chat template format:
<|user|>
How are you doing?
<|assistant|>
I'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
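To make the format concrete, here is a minimal sketch that renders a message list into the layout shown above using plain string formatting. `render_chat` is a hypothetical helper written for illustration, not part of any library; the tokenizer's own `apply_chat_template` remains the authoritative implementation and may differ in details such as a leading special token or whitespace between turns.

```python
def render_chat(messages, add_generation_prompt=True):
    """Render messages in the <|role|> layout shown above (illustrative only)."""
    parts = []
    for message in messages:
        turn = f"<|{message['role']}|>\n{message['content']}"
        # Completed assistant turns are closed with the end-of-text token.
        if message["role"] == "assistant":
            turn += "<|endoftext|>"
        parts.append(turn)
    text = "\n".join(parts)
    # Leave an open assistant header so the model writes the next turn.
    if add_generation_prompt:
        text += "\n<|assistant|>\n"
    return text


prompt = render_chat([{"role": "user", "content": "How are you doing?"}])
print(prompt)
```

With `add_generation_prompt=True`, the rendered string ends in an open `<|assistant|>` header, which is the point where generation starts.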
In Ai2 demos, this system prompt is used by default:
You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute for AI.
However, the model has not been trained with a specific system prompt requirement.
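Since the system prompt is optional, one way to handle it is to prepend it only when it is wanted. The sketch below assumes the bundled chat template accepts a `system` role; `build_messages` and `AI2_DEMO_SYSTEM_PROMPT` are hypothetical names introduced here for illustration.

```python
# System prompt quoted from the Ai2 demos (see above).
AI2_DEMO_SYSTEM_PROMPT = (
    "You are OLMo 2, a helpful and harmless AI Assistant "
    "built by the Allen Institute for AI."
)


def build_messages(user_text, system_prompt=AI2_DEMO_SYSTEM_PROMPT):
    """Return a chat message list, with the system turn included only if set."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return messages


with_system = build_messages("Explain quantum computing in simple terms.")
without_system = build_messages(
    "Explain quantum computing in simple terms.", system_prompt=None
)
print(len(with_system), len(without_system))
```

Either list can be passed to `tokenizer.apply_chat_template` in place of `messages` in the loading example.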
License: Apache 2.0 (same as the base model).
@article{olmo20242olmo2furious,
title={2 OLMo 2 Furious},
author={Team OLMo and Pete Walsh and Luca Soldaini and others},
year={2024},
eprint={2501.00656},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.00656},
}
@misc{olmo2-32b-awq,
title={OLMo-2-0325-32B-Instruct AWQ 4-bit},
author={Quantized by ronantakizawa},
year={2025},
url={https://huggingface.co/ronantakizawa/olmo2-32b-instruct-awq-w4a16}
}
🤖 Generated with LLM Compressor
Base model: allenai/OLMo-2-0325-32B