

🧮 OmniMath-2B-Pro

OmniMath-2B-Pro is a compact, improved model with strong mathematical reasoning, built on the Qwen3.5-2B hybrid architecture (Gated Delta Networks interleaved with standard attention). Trained on 100,000+ carefully selected mathematical problems from various datasets, it excels at step-by-step solutions, arithmetic word problems, geometric reasoning, and error recovery.

The standout capability of OmniMath-2B-Pro is solving Olympiad-level problems.

Compared to the previous version, OmniMath-2B-Pro shows improved results on mathematical benchmarks.

Despite its small size, OmniMath-2B-Pro delivers strong performance and is well suited to resource-constrained deployment environments.


✨ Key Features

  • Efficient 2B Scale: Only 2 billion parameters; runs smoothly on a single T4 GPU or even CPU with quantization.
  • Step-by-Step Reasoning: Trained with explicit <think>...</think>-style chain-of-thought prompts.
  • Hybrid Architecture: Inherits Qwen3.5's Gated Delta Networks for efficient long-context processing.
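Because responses wrap the chain of thought in `<think>...</think>` tags, downstream code usually wants to separate the reasoning trace from the final answer. A minimal helper sketch (not part of the model's own tooling; it assumes a single well-formed think block):

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes the model emits one <think>...</think> block before the
    final answer, as described above; if no block is found, the
    reasoning part is empty and the whole text is the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>5 apples at $2 each is 5 * 2 = 10.</think>You pay $10."
)
print(answer)  # → You pay $10.
```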

📊 Benchmarks

Preliminary results (evaluation ongoing).

| Model | Size (params) | GSM8K Accuracy |
|---|---|---|
| Qwen2.5-Math-1.5B | 1.5B | 54% |
| Phi-2 (0-shot CoT) | 2.7B | 50.0% |
| OmniMath-2B | 2B | ???% |
| dolphin-2_6-phi-2 | 2.7B | 58.07% |
| Qwen2.5-0.5B-Instruct | 0.5B | 49.6% |
| gemma-3-1b-it | 1.1B | 62.8% |
| MobileLLM-R1-950M | 950M | 52.8% |
| Gemma 2 2B IT | 2B | 23.9% |

Updates coming soon.


🚀 Quickstart

🤗 Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZirTech/OmniMath-2B-Pro"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
    {"role": "user", "content": "A store sells apples for $2 each. If you buy 5 apples, how much do you pay?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.6, top_p=0.95, top_k=20)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
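For word problems like the one above, it is often useful to pull the final numeric result out of the step-by-step text for programmatic checking. A small heuristic sketch (the card does not guarantee any particular output format, so this assumes the model states the result last):

```python
import re


def last_number(solution: str):
    """Return the last number mentioned in a step-by-step solution.

    Heuristic only: assumes the final answer appears last in the text,
    which this model card does not guarantee. Handles $ signs, commas,
    and decimals; returns None if no number is found.
    """
    matches = re.findall(r"-?\$?\d[\d,]*(?:\.\d+)?", solution)
    if not matches:
        return None
    return float(matches[-1].replace("$", "").replace(",", ""))


print(last_number("Each apple costs $2, so 5 apples cost 5 * 2 = $10."))  # → 10.0
```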

⚡ vLLM

```shell
vllm serve ZirTech/OmniMath-2B-Pro --tensor-parallel-size 1 --max-model-len 4096
```
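Once served, vLLM exposes an OpenAI-compatible API (by default at http://localhost:8000/v1). The sketch below only builds the chat-completions request payload; POST it with any HTTP client, or point the `openai` package at the local base URL. The endpoint path and sampling values here are assumptions, not part of this card:

```python
import json


def build_chat_request(question: str) -> dict:
    """Build an OpenAI-style /v1/chat/completions payload for the
    locally served model. Sampling parameters mirror the Transformers
    quickstart above and can be tuned freely."""
    return {
        "model": "ZirTech/OmniMath-2B-Pro",
        "messages": [
            {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
            {"role": "user", "content": question},
        ],
        "max_tokens": 256,
        "temperature": 0.6,
        "top_p": 0.95,
    }


payload = build_chat_request("What is 12 * 7?")
print(json.dumps(payload, indent=2))
```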
If you prefer to skip the server, the same model can be loaded directly with Transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZirTech/OmniMath-2B-Pro"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)
model.eval()

def ask(question):
    prompt = f"<|im_start|>system\nYou are a helpful math assistant.<|im_end|>\n<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding; temperature is irrelevant when do_sample=False.
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    # Trim any spillover into a new conversation turn.
    if "user" in response:
        response = response.split("user")[0].strip()
    return response

print(ask("Find the degree of the field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q."))
```

๐Ÿ—๏ธ Architecture

OmniMath-2B-Pro fully preserves Qwen3.5-2B's design:

  • Gated Delta Networks: Linear attention layers interleaved with standard attention.

  • 262K Native Context: Supports up to 262,144 tokens (extendable with YaRN).

  • Built on Qwen3_5ForCausalLM: Seamless integration with the Hugging Face ecosystem.
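Extending the native 262K context with YaRN is typically done through a `rope_scaling` entry in the model config, where the factor is the ratio of the target length to the native length. A sketch under that assumption (the exact keys follow the common Transformers convention; verify against the model's own config.json before relying on them):

```python
NATIVE_CONTEXT = 262_144  # native context length stated above


def yarn_rope_scaling(target_context: int) -> dict:
    """Build a hypothetical YaRN `rope_scaling` config dict for a
    longer target context. The scaling factor is simply
    target_context / NATIVE_CONTEXT."""
    return {
        "rope_type": "yarn",
        "factor": target_context / NATIVE_CONTEXT,
        "original_max_position_embeddings": NATIVE_CONTEXT,
    }


print(yarn_rope_scaling(1_048_576)["factor"])  # → 4.0
```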


⚠️ Limitations

  • Numerical accuracy may occasionally falter; always double-check critical calculations.

  • Geometry was trained only on textual descriptions; performance on image-based geometry problems is limited.

  • Non-English math problems have not been thoroughly evaluated.


🙏 Acknowledgments

  • Qwen Team for the outstanding Qwen3.5 base models.

  • Hugging Face for dataset hosting and the Transformers library.

  • Thanks to the community for supporting the OmniMath-2B model.


📖 Citation

```bibtex
@misc{omnimathpro2b2026,
  title={OmniMath-2B-Pro: A Strong and Lightweight Open-Source Mathematical Model},
  author={Zirt Techniques},
  year={2026},
  url={https://huggingface.co/ZirTech/OmniMath-2B-Pro}
}
```

Built by Zirt Tech ❤️
