Forge 1 Mini

Forge 1 Mini is a tiny (about 5.2M parameters) chat model in the Forge series. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.

This repo includes:

  • model.safetensors: corrected Hugging Face checkpoint.
  • tokenizer.model: SentencePiece tokenizer with ChatML markers.
  • forge-1-mini-f16.gguf: llama.cpp-compatible F16 GGUF.

llama.cpp / llama-cpp-python

Use the embedded ChatML template and stop on <|im_end|>.

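from llama_cpp import Llama

# Load the F16 GGUF with a 512-token context window.
llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)

# With no explicit chat_format, the chat template embedded in the GGUF is used.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=96,
    temperature=0.0,  # greedy decoding
    stop=["<|im_end|>"],  # stop at the ChatML end-of-turn marker
)
print(out["choices"][0]["message"]["content"].strip())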

Expected answer:

4
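
For readers who want to see what the ChatML markers look like on the wire, here is a minimal sketch of the conventional ChatML layout. It is an assumption that the template embedded in the GGUF matches this layout exactly; the helper name format_chatml and its add_generation_prompt flag are illustrative and not part of this repo or of llama-cpp-python.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the conventional ChatML layout (assumed, not read from the GGUF)."""
    parts = []
    for message in messages:
        # Each turn is: start marker, role, newline, content, end marker, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn so the model writes the reply next.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

Because the model closes its turn with <|im_end|>, passing that marker as a stop string keeps the reply from running into a new, hallucinated turn.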

Local Verification

The uploaded GGUF passed a llama.cpp smoke test using llama-cpp-python tokenization and greedy sampling:

Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
Hi -> Hi! I am Forge-1-Mini. How can I help?
What is 2 + 2? -> 4
Write a Python function that adds two numbers. -> def add(a, b): return a + b
Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
How should I treat someone I disagree with? -> Treat the person with dignity...
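
A smoke test of this kind could be scripted roughly as follows. This is a sketch under assumptions, not the script that was actually used: run_smoke_test and stub_generate are hypothetical names, the check is a plain substring match, and only the four replies quoted in full above are included. The stub replays the documented replies so the example runs without the model file.

```python
from typing import Callable, List, Tuple

# Prompt / expected-substring pairs taken from the transcript above.
SMOKE_CASES: List[Tuple[str, str]] = [
    ("Who are you?", "Forge-1-Mini"),
    ("Hi", "Forge-1-Mini"),
    ("What is 2 + 2?", "4"),
    ("Write a Python function that adds two numbers.", "return a + b"),
]


def run_smoke_test(
    generate: Callable[[str], str],
    cases: List[Tuple[str, str]] = SMOKE_CASES,
) -> List[str]:
    """Return the prompts whose reply does not contain the expected substring."""
    failures = []
    for prompt, expected in cases:
        reply = generate(prompt)
        if expected not in reply:
            failures.append(prompt)
    return failures


def stub_generate(prompt: str) -> str:
    # Hypothetical stand-in that replays the documented replies.
    canned = {
        "Who are you?": "I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.",
        "Hi": "Hi! I am Forge-1-Mini. How can I help?",
        "What is 2 + 2?": "4",
        "Write a Python function that adds two numbers.": "def add(a, b): return a + b",
    }
    return canned.get(prompt, "")


failures = run_smoke_test(stub_generate)
print("failures:", failures)
```

To run it against the real model, replace stub_generate with a function that wraps llm.create_chat_completion and returns the message content, using temperature=0.0 so the replies are reproducible.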

Model Family Notes

| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
|---|---|---|---|---|
| Forge 1 Mini | 5.2M | Open-source, can host anywhere. | $0.01-$0.05 | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
| Forge 1 | ~40M | Open-source, can host anywhere. | $0.10-$0.30 | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
| Forge 1 Reasoning | ~40M | Hosted on North servers, proprietary. | $0.20-$1.00 | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
| Forge 1 Ultra | ~150M | Hosted on North servers, proprietary. | $0.15-$0.80 | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
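
To turn the per-million figures above into a budget for a given workload, the arithmetic is a simple linear scaling. The sketch below assumes cost scales linearly with output tokens and ignores input tokens and any fixed hosting overhead; estimate_cost is a hypothetical helper and the 250,000-token workload is an illustrative number.

```python
from typing import Dict, Tuple

# Estimated USD cost per 1M output tokens (low, high), copied from the table above.
COST_PER_MILLION: Dict[str, Tuple[float, float]] = {
    "Forge 1 Mini": (0.01, 0.05),
    "Forge 1": (0.10, 0.30),
    "Forge 1 Reasoning": (0.20, 1.00),
    "Forge 1 Ultra": (0.15, 0.80),
}


def estimate_cost(model: str, output_tokens: int) -> Tuple[float, float]:
    """Scale the per-million range linearly to the given number of output tokens."""
    low, high = COST_PER_MILLION[model]
    scale = output_tokens / 1_000_000
    return (low * scale, high * scale)


low, high = estimate_cost("Forge 1 Mini", 250_000)
print(f"Forge 1 Mini, 250k output tokens: ${low:.4f} to ${high:.4f}")
```

For Forge 1 Mini this works out to a quarter of the per-million range, that is, between $0.0025 and $0.0125 for 250,000 output tokens.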