# Twkeed-GPT-120B (توكيد)
A powerful Arabic language model based on GPT-OSS 120B (117B parameters, 5.1B active via MoE).
Fine-tuned on Saudi Arabian content including:
- Saudi Labor Law articles (Articles 53, 84-85, 109, 151)
- Saudi dialect understanding (يبي "wants", وش "what", وين "where", الحين "now")
- Arabic grammar and writing
- Vision 2030 knowledge
## Model Details
- Base Model: mlx-community/gpt-oss-120b-MXFP4-Q8
- Architecture: Mixture of Experts (117B total, 5.1B active)
- Reasoning: Near o4-mini level capabilities
- Fine-tuning Method: LoRA (r=8, alpha=16) with unsloth-mlx
- Training Hardware: Mac Studio M3 Ultra 96GB
- Language: Arabic (Modern Standard + Saudi Dialect)
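The LoRA settings above (r=8, alpha=16) keep the trainable footprint tiny relative to the frozen 117B base. As a rough illustration of why (a sketch; the 4096×4096 projection size below is a hypothetical example, not the actual gpt-oss-120b layer dimension):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer:
    a (d_in x r) down-projection A plus an (r x d_out) up-projection B."""
    return d_in * r + r * d_out

# Hypothetical 4096x4096 attention projection with r=8:
per_layer = lora_params(4096, 4096, r=8)
print(per_layer)  # 65536 trainable params vs ~16.8M in the frozen layer
```

At rank 8, each adapted layer trains well under 1% of that layer's original weights, which is what makes fine-tuning a model this size feasible on a single Mac Studio.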
## Model Identity
This model identifies as توكيد (Twkeed), an Arabic AI assistant with strong reasoning capabilities. When asked "من أنت؟" ("Who are you?"), the model responds with this identity.
## Usage

```python
from mlx_lm import load, generate

model, tokenizer = load("twkeed-sa/twkeed-gpt-120b")

response = generate(
    model,
    tokenizer,
    prompt="مرحباً، من أنت؟",  # "Hello, who are you?"
    max_tokens=200,
)
print(response)
```
## Why 120B?
| Aspect | 20B | 120B |
|---|---|---|
| Parameters | 21B (3.6B active) | 117B (5.1B active) |
| Reasoning | Good | Excellent (near o4-mini) |
| Arabic Knowledge | Very Good | Excellent |
The 120B base model provides much stronger reasoning and built-in knowledge, so less fine-tuning adaptation is needed.
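The per-token cost advantage of the MoE design follows directly from the numbers in the table; a quick back-of-envelope check:

```python
# Parameter counts from the table above
total, active = 117e9, 5.1e9

fraction = active / total
print(f"{fraction:.1%} of parameters active per token")  # ~4.4%
```

Only about 4.4% of the weights participate in any single forward pass, which is why a 117B model remains practical to run and fine-tune on a single 96GB machine.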
## Training Data
Fine-tuned on 40,000+ Arabic examples including:
- Arabic Alpaca dataset
- Custom Saudi Labor Law content
- Saudi dialect examples
- Arabic grammar instruction data
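Alpaca-style datasets use the familiar instruction/input/output record shape; a minimal illustration (the Arabic content here is a hypothetical example based on the model's stated identity, not an actual record from the training set):

```python
import json

# Hypothetical Alpaca-format record (illustrative only)
example = {
    "instruction": "من أنت؟",  # "Who are you?"
    "input": "",               # optional context, empty here
    "output": "أنا توكيد، مساعد ذكاء اصطناعي عربي.",  # "I am Twkeed, an Arabic AI assistant."
}
print(json.dumps(example, ensure_ascii=False))
```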
## License
Apache 2.0
## Author
Fine-tuned using unsloth-mlx
## Model tree for twkeed-sa/twkeed-gpt-120b

- Base model: openai/gpt-oss-120b
- Quantized: mlx-community/gpt-oss-120b-MXFP4-Q8