Forge 1 Mini
Forge 1 Mini is a tiny Forge-series chat model. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.
This repo includes:
- model.safetensors: corrected Hugging Face checkpoint.
- tokenizer.model: SentencePiece tokenizer with ChatML markers.
- forge-1-mini-f16.gguf: llama.cpp-compatible F16 GGUF.
llama.cpp / llama-cpp-python
Use the embedded ChatML template and stop on <|im_end|>.
from llama_cpp import Llama

# Load the F16 GGUF with a 512-token context window.
llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)

# Greedy decoding (temperature 0.0); stop at the ChatML end-of-turn marker.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=96,
    temperature=0.0,
    stop=["<|im_end|>"],
)
print(out["choices"][0]["message"]["content"].strip())
Expected answer:
4
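For callers who bypass the chat API and send a raw prompt, the sketch below shows what a ChatML-formatted prompt typically looks like. It assumes the conventional ChatML layout (`<|im_start|>role`, a newline, the content, then `<|im_end|>`); the template embedded in the GGUF remains authoritative and may differ in whitespace or system-prompt handling. The helper name `build_chatml_prompt` is hypothetical and not part of this repo.

```python
# Assumed ChatML markers; the GGUF's embedded template is the source of truth.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

A string built this way could be passed to `llm.create_completion(prompt, stop=["<|im_end|>"])` in llama-cpp-python, although `create_chat_completion` is usually simpler because it applies the embedded template for you.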
Local Verification
The uploaded GGUF passed a llama.cpp smoke test using llama-cpp-python tokenization and greedy sampling:
Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
Hi -> Hi! I am Forge-1-Mini. How can I help?
What is 2 + 2? -> 4
Write a Python function that adds two numbers. -> def add(a, b): return a + b
Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
How should I treat someone I disagree with? -> Treat the person with dignity...
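A check like the one above can be scripted. The following is a minimal sketch, not the script actually used for the upload: `run_smoke_test` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in that returns canned replies so the example runs without the model. Replies that are truncated with "..." in the list above are compared by prefix only.

```python
# (prompt, expected reply prefix) pairs taken from the smoke-test list above.
CASES = [
    ("Who are you?", "I am Forge-1-Mini"),
    ("Hi", "Hi! I am Forge-1-Mini. How can I help?"),
    ("What is 2 + 2?", "4"),
]


def run_smoke_test(generate, cases):
    """Call generate(prompt) for each case and collect the ones that fail."""
    failures = []
    for prompt, expected_prefix in cases:
        reply = generate(prompt).strip()
        if not reply.startswith(expected_prefix):
            failures.append((prompt, expected_prefix, reply))
    return failures


def fake_generate(prompt):
    """Stand-in for a real model call; returns canned replies for the demo."""
    canned = {
        "Who are you?": "I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.",
        "Hi": "Hi! I am Forge-1-Mini. How can I help?",
        "What is 2 + 2?": "4",
    }
    return canned.get(prompt, "")


failures = run_smoke_test(fake_generate, CASES)
print("smoke test passed" if not failures else f"smoke test failed: {failures}")
```

To test the real model, replace `fake_generate` with a function that wraps `llm.create_chat_completion` and returns the message content.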
Model Family Notes
| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
|---|---|---|---|---|
| Forge 1 Mini | 5.2M | Open-source, can host anywhere. | $0.01-$0.05 | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
| Forge 1 | ~40M | Open-source, can host anywhere. | $0.10-$0.30 | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
| Forge 1 Reasoning | ~40M | Hosted on North servers, proprietary. | $0.20-$1.00 | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
| Forge 1 Ultra | ~150M | Hosted on North servers, proprietary. | $0.15-$0.80 | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
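As a rough illustration of how the per-million estimates in the table translate into a budget, the sketch below scales each range by an output-token count. The prices are copied from the table and are estimates only; the function name `estimate_output_cost` is hypothetical.

```python
# Estimated USD per 1M output tokens (low, high), copied from the table above.
PRICE_PER_MILLION_USD = {
    "Forge 1 Mini": (0.01, 0.05),
    "Forge 1": (0.10, 0.30),
    "Forge 1 Reasoning": (0.20, 1.00),
    "Forge 1 Ultra": (0.15, 0.80),
}


def estimate_output_cost(model, output_tokens):
    """Return the (low, high) estimated cost in USD for output_tokens tokens."""
    low, high = PRICE_PER_MILLION_USD[model]
    scale = output_tokens / 1_000_000
    return low * scale, high * scale


low, high = estimate_output_cost("Forge 1 Mini", 2_000_000)
print(f"${low:.2f} to ${high:.2f}")
```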