Foros Robotics Action Engine
Foros is an ultra-compact 10M-parameter instruction-to-JSON model designed for low-latency, on-device robotics control. It translates plain-English robot commands (including temporal loops, timed sequences, and FSM transitions) directly into structured JSON arrays of operations compatible with ROS / ROS2 and major industrial robot controllers (URScript, KRL, RAPID, Fanuc, DRL).
Developed by AMEFORGE (https://huggingface.co/AMFORGE). Built on the in-house SparseMind architecture (sparse token attention, sparse channel FFN, dynamic neuron typing).
Current version: v5.10, production-ready, deployed on CPU / Jetson / Raspberry Pi 4.
Benchmark Results (Held-Out, 142 Curated Examples)
Foros is evaluated on a held-out test suite of 142 hand-curated robotics commands spanning 5 difficulty tiers. None of these prompts appear in the training corpus. All measurements were taken on a Kaggle T4 GPU with greedy decoding.
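The card does not spell out its scoring rubric, so the following is only a sketch of one plausible reading of the three reported metrics. The function names `op_names` and `score`, and the exact definitions (Op Correct as "same sequence of top-level `op` names", Exact Match as "parsed JSON equal to the target"), are assumptions made for illustration.

```python
import json


def op_names(ops):
    """Return the top-level op names of a parsed output, or None if it is not a list."""
    if not isinstance(ops, list):
        return None
    return [step.get("op") if isinstance(step, dict) else None for step in ops]


def score(prediction: str, target: str) -> dict:
    """Score one raw model output against its ground-truth JSON string."""
    try:
        pred = json.loads(prediction)
    except json.JSONDecodeError:
        # Unparseable output fails every metric.
        return {"valid_json": False, "op_correct": False, "exact_match": False}
    gold = json.loads(target)
    return {
        "valid_json": True,
        "op_correct": op_names(pred) == op_names(gold),  # assumed definition
        "exact_match": pred == gold,                     # assumed definition
    }
```

Under these assumed definitions, an output with the right operation but a wrong coordinate counts toward Valid JSON and Op Correct but not Exact Match.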
Per-Tier Breakdown (v5.10)
| Tier | Description | N | Valid JSON | Op Correct | Exact Match |
|---|---|---|---|---|---|
| Tier 1 | Paraphrase (novel templates) | 32 | 100.0% | 100.0% | 100.0% |
| Tier 2 | Informal (natural language) | 29 | 100.0% | 96.6% | 93.1% |
| Tier 3 | Typos & noise robustness | 30 | 100.0% | 80.0% | 43.3% |
| Tier 4 | Multi-step sequences | 22 | 100.0% | 100.0% | 72.7% |
| Tier 5 | Long chains & temporal loops | 29 | 96.6% | 96.6% | 69.0% |
| Global | Weighted average | 142 | 99.3% | 94.4% | 76.1% |
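The global row can be reproduced from the per-tier rows by weighting each tier's percentage by its example count. The snippet below is a small arithmetic check using only the numbers in the table; the variable and function names are illustrative.

```python
# (N, valid JSON %, op correct %, exact match %) per tier, copied from the table.
tiers = {
    "Tier 1": (32, 100.0, 100.0, 100.0),
    "Tier 2": (29, 100.0, 96.6, 93.1),
    "Tier 3": (30, 100.0, 80.0, 43.3),
    "Tier 4": (22, 100.0, 100.0, 72.7),
    "Tier 5": (29, 96.6, 96.6, 69.0),
}

total = sum(row[0] for row in tiers.values())


def weighted(column: int) -> float:
    """Example-count-weighted average of one percentage column."""
    return sum(row[0] * row[column] for row in tiers.values()) / total


print(total)                    # 142
print(round(weighted(1), 1))    # 99.3 (valid JSON)
print(round(weighted(2), 1))    # 94.4 (op correct)
print(round(weighted(3), 1))    # 76.1 (exact match)
```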
Version Trajectory
We report Exact Match on the held-out benchmark across successive iterations to document architectural and data improvements transparently:
| Version | Held-Out Exact Match | Held-Out JSON Valid | Notable Change |
|---|---|---|---|
| v5.4 (baseline) | 62.7% | ~99% | Initial production deployment |
| v5.9 | 63.4% | 100.0% | Numerical precision standardization, deterministic pick-and-place targets |
| v5.10 | 76.1% | 99.3% | Refactored conditional templates, expanded informal/imperative vocabulary, integer-form numerical prompts |
Head-to-Head Comparison (All Measured)
All baselines evaluated on the same 142-example held-out benchmark, same hardware (Kaggle T4 GPU), same scoring rubric, greedy decoding.
| Model | Exact Match | Valid JSON | Op Correct | Latency (avg) | Size |
|---|---|---|---|---|---|
| Foros v5.10 (AMEFORGE) | 76.1% | 99.3% | 94.4% | 508 ms | 39.6 MB |
| Qwen2.5-1.5B-Instruct | 28.0% | 60.0% | 44.0% | 1,998 ms | 2,944 MB |
| Qwen2.5-0.5B-Instruct | 18.0% | 46.0% | 24.0% | 3,766 ms | 942 MB |
| TinyLlama-1.1B-Chat | 6.0% | 22.0% | 10.0% | 7,315 ms | 2,098 MB |
| SmolLM2-360M-Instruct | 0.0% | 6.0% | 2.0% | 5,884 ms | 690 MB |
Key takeaways:
- Foros reaches 76.1% exact match on held-out robotics commands. The best general-purpose small LM evaluated (Qwen2.5-1.5B, ~150× larger) reaches only 28.0%, a gap of 48 percentage points in Foros's favor.
- The smallest comparable general-purpose LM (SmolLM2-360M, ~36× larger) reaches 0.0% exact match and only 6.0% valid JSON, indicating that general-purpose small models struggle even to produce syntactically valid output on this task.
- 4× lower latency than Qwen2.5-1.5B, 14× lower than TinyLlama-1.1B.
- 17× smaller than the smallest competitive baseline (SmolLM2-360M), 74× smaller than Qwen2.5-1.5B.
- Runs on Raspberry Pi 4, Jetson Nano/Orin, or any embedded CPU. No GPU required for inference, no cloud dependency, no telemetry.
Latency profile: atomic commands (Tiers 1-3) run at ~305 ms, compound sequences (Tiers 4-5) at ~860 ms. The 508 ms average reflects the full benchmark distribution, including long temporal loops.
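As a sanity check on the figures above, the latency and size ratios follow directly from the comparison table, and blending the two rounded tier latencies by example count gives roughly 504 ms. That is close to the reported 508 ms; the small difference presumably comes from the tier figures being rounded. The variable names below are illustrative.

```python
# Example counts per tier, from the per-tier breakdown.
tier_sizes = {1: 32, 2: 29, 3: 30, 4: 22, 5: 29}
n_atomic = sum(tier_sizes[t] for t in (1, 2, 3))    # 91 examples
n_compound = sum(tier_sizes[t] for t in (4, 5))     # 51 examples

# Count-weighted blend of the approximate per-group latencies.
blended_ms = (n_atomic * 305 + n_compound * 860) / (n_atomic + n_compound)
print(round(blended_ms))          # 504

# Ratios quoted in the key takeaways.
print(round(1998 / 508, 1))       # 3.9  (latency vs Qwen2.5-1.5B)
print(round(7315 / 508, 1))       # 14.4 (latency vs TinyLlama-1.1B)
print(round(690 / 39.6, 1))       # 17.4 (size vs SmolLM2-360M)
print(round(2944 / 39.6, 1))      # 74.3 (size vs Qwen2.5-1.5B)
```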
What it does
Atomic Commands
| Natural Language Input | Structured Output (ROS JSON) |
|---|---|
| `move to x=0.5 y=-1.2 z=0.8` | `[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]` |
| `rotate joints to [0.0, 45.0, 90.0, 0.0, 0.0, 0.0]` | `[{"op":"joint_move","joints":[0.0,45.0,90.0,0.0,0.0,0.0]}]` |
| `close gripper with force 0.75` | `[{"op":"gripper","action":"close","force":0.75}]` |
| `wait for 3.5 seconds` | `[{"op":"wait","seconds":3.5}]` |
| `set velocity to 0.75 m/s` | `[{"op":"speed","velocity":0.75}]` |
| `halt all motion` | `[{"op":"stop"}]` |
| `upon sensor_trip return to home position` | `[{"op":"safety","cond":"sensor_trip","then":[{"op":"home"}]}]` |
Temporal / Loop Commands
| Natural Language Input | Structured Output |
|---|---|
| `repeat 5 times: move arm` | `[{"op":"repeat","times":5,"body":[...]}]` |
| `keep doing move arm until obstacle` | `[{"op":"repeat_until","cond":"obstacle","body":[...]}]` |
| `run control loop at 100Hz for 2.5 seconds` | `[{"op":"control_loop","frequency_hz":100,"duration_s":2.5,"body":[...]}]` |
| `every 0.5s do rotate joints for 4 steps` | `[{"op":"timed_seq","interval_s":0.5,"count":4,"body":[...]}]` |
| `simultaneously move arm and set speed` | `[{"op":"parallel","branches":[[...],[...]]}]` |
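The card does not define how a downstream executor should interpret these loop operations. As a sketch only, the helper below (a hypothetical `unroll` function, not part of Foros) flattens the two statically bounded ops, assuming `repeat` runs its body `times` times and `timed_seq` runs its body `count` times with a wait of `interval_s` after each pass. Ops whose extent depends on runtime conditions (`repeat_until`, `control_loop`, `parallel`) are passed through untouched.

```python
def unroll(ops):
    """Flatten statically bounded loops into a plain list of operations.

    The loop semantics here are assumptions made for illustration.
    """
    flat = []
    for op in ops:
        kind = op["op"]
        if kind == "repeat":
            for _ in range(op["times"]):
                flat.extend(unroll(op["body"]))
        elif kind == "timed_seq":
            for _ in range(op["count"]):
                flat.extend(unroll(op["body"]))
                # Assumed: the interval is realized as a wait after each pass.
                flat.append({"op": "wait", "seconds": op["interval_s"]})
        else:
            # Runtime-dependent or atomic ops are left for the executor.
            flat.append(op)
    return flat


program = [{"op": "repeat", "times": 2, "body": [{"op": "home"}]}]
print(unroll(program))  # [{'op': 'home'}, {'op': 'home'}]
```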
Complex Sequences (Multi-step planning)
Input: `pick up the red_box at 0.5 0.5 0.0 and place it at -0.5 1.0 0.0`

Output:

```json
[
  {"op":"move","x":0.5,"y":0.5,"z":0.0},
  {"op":"gripper","action":"close"},
  {"op":"move","x":0.5,"y":0.5,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.0},
  {"op":"gripper","action":"open"},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2}
]
```
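The seven-step pattern in this example (approach, grasp, lift, traverse, lower, release, retreat) can be written out as a reference generator, which may be handy for checking model output on coordinate-critical pick-and-place commands. This is a sketch: the function name `pick_and_place` is hypothetical, and the 0.2 clearance added to z is inferred from this single example rather than documented.

```python
def pick_and_place(pick, place, clearance=0.2):
    """Build the expected op list for a pick-and-place between two (x, y, z) points."""
    px, py, pz = pick
    qx, qy, qz = place

    def move(x, y, z):
        return {"op": "move", "x": x, "y": y, "z": z}

    return [
        move(px, py, pz),                      # approach the object
        {"op": "gripper", "action": "close"},  # grasp
        move(px, py, pz + clearance),          # lift
        move(qx, qy, qz + clearance),          # traverse at clearance height
        move(qx, qy, qz),                      # lower to the target
        {"op": "gripper", "action": "open"},   # release
        move(qx, qy, qz + clearance),          # retreat
    ]


expected = pick_and_place((0.5, 0.5, 0.0), (-0.5, 1.0, 0.0))
print(len(expected))  # 7
```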
Supported Operations
| Category | Operations |
|---|---|
| Motion | move, joint_move, move_tcp, move_joint, home, trajectory |
| End Effector | gripper, tool, get_joint_values |
| Control Flow | wait, safety, stop, repeat, repeat_until |
| Temporal | timed_seq, control_loop, parallel, state_transition |
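Before forwarding model output to a controller, it may be worth checking that every operation name is one of those listed above. The validator below is a minimal sketch, not an official schema: the `validate` function is hypothetical, and the nested keys it descends into (`body`, `then`, `branches`) are taken from the examples in this card.

```python
SUPPORTED_OPS = {
    "move", "joint_move", "move_tcp", "move_joint", "home", "trajectory",
    "gripper", "tool", "get_joint_values",
    "wait", "safety", "stop", "repeat", "repeat_until",
    "timed_seq", "control_loop", "parallel", "state_transition",
}


def validate(ops, path="root"):
    """Return a list of human-readable problems; an empty list means no issues found."""
    if not isinstance(ops, list):
        return [f"{path}: expected a JSON array of operations"]
    errors = []
    for i, step in enumerate(ops):
        here = f"{path}[{i}]"
        if not isinstance(step, dict) or "op" not in step:
            errors.append(f"{here}: missing 'op' field")
            continue
        if step["op"] not in SUPPORTED_OPS:
            errors.append(f"{here}: unsupported op {step['op']!r}")
        # Recurse into nested operation lists seen in the card's examples.
        for key in ("body", "then"):
            if key in step:
                errors.extend(validate(step[key], f"{here}.{key}"))
        for j, branch in enumerate(step.get("branches", [])):
            errors.extend(validate(branch, f"{here}.branches[{j}]"))
    return errors


print(validate([{"op": "fly"}]))  # ["root[0]: unsupported op 'fly'"]
```

Note that the atomic-command examples in this card also emit a `speed` op, which is not in the table; a strict allow-list like this one would flag it.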
Model Details
| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse attention + sparse FFN + dynamic neuron typing) |
| Parameters | 10,347,395 (~10.3 M) |
| Hidden size / Layers / Heads | 256 / 6 / 8 |
| Context length | 384 tokens |
| Tokenizer | In-house domain-specific BPE, vocab 3,000, atomic numerical tokens |
| Precision | FP32 |
| Model size | 39.6 MB |
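As a rough consistency check, the parameter count and FP32 precision account for almost all of the listed model size: four bytes per parameter gives about 39.5 MiB of raw weights. The remaining gap to 39.6 MB is presumably checkpoint overhead such as the stored config, which is an assumption rather than something the card states.

```python
params = 10_347_395
bytes_fp32 = params * 4           # 4 bytes per FP32 parameter
size_mib = bytes_fp32 / 2**20     # binary megabytes

print(bytes_fp32)                 # 41389580
print(round(size_mib, 1))         # 39.5
```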
Training Methodology
Foros is trained on a hybrid corpus combining:
- Programmatic synthetic data covering all supported operations, with paraphrastic variations (formal, informal, imperative tones), numerical precision variants, and compositional sequences of varying depth.
- Curated production logs: anonymized real-world prompts collected from deployed instances, with manually verified ground-truth JSON targets.
- Iterative refinement: successive versions (v5.4 → v5.9 → v5.10) integrate fixes derived from systematic failure analysis on the held-out benchmark.
Training is conducted from scratch (no pre-trained checkpoint) on a single T4 GPU in approximately 4 hours.
Detailed corpus composition, generator weights, and hyperparameter schedules are proprietary to AMEFORGE.
Known Limitations
- Typo robustness: Tier 3 sits at 43.3% exact match. Severely mangled tokens (e.g., `mvoe` instead of `move`) can degrade numerical extraction. A typo-aware fine-tune is planned for v5.11.
- Relative motion: Foros operates on absolute coordinates. Prompts like `move left by 20 cm` are out of domain and should be resolved by an upstream natural-language pre-processor that converts them to absolute positions.
- Open-ended planning: Foros is a structured translator, not a planner. For multi-step reasoning beyond literal sequencing, pair it with an upstream planner.
- Numerical fidelity in low-confidence contexts: when the prompt vocabulary is unfamiliar, the model may default to in-distribution coordinate values. For coordinate-critical operations in production, we recommend a lightweight regex post-processor that re-injects explicit numerical values from the prompt as a safety net.
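The regex post-processor recommended in the last item could look roughly like the sketch below. It is deliberately narrow and entirely illustrative: the function name `reinject_coordinates` is hypothetical, it only handles prompts that spell coordinates as `x=… y=… z=…`, and it only touches outputs consisting of a single `move` op, leaving everything else unchanged.

```python
import re

# Matches explicit axis assignments such as "x=0.5" or "y = -1.2".
AXIS_RE = re.compile(r"\b([xyz])\s*=\s*(-?\d+(?:\.\d+)?)")


def reinject_coordinates(prompt, ops):
    """Overwrite x/y/z in a single-move output with values stated in the prompt."""
    stated = {axis: float(value) for axis, value in AXIS_RE.findall(prompt)}
    if len(ops) != 1 or ops[0].get("op") != "move":
        return ops  # out of scope for this sketch: leave untouched
    fixed = dict(ops[0])
    fixed.update(stated)
    return [fixed]


drifted = [{"op": "move", "x": 0.5, "y": 0.0, "z": 0.8}]
print(reinject_coordinates("move to x=0.5 y=-1.2 z=0.8", drifted))
# [{'op': 'move', 'x': 0.5, 'y': -1.2, 'z': 0.8}]
```

Restricting the rewrite to single-move outputs avoids guessing which step of a multi-step sequence a number belongs to; a production version would need an explicit mapping for compound commands.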
Local Inference
```python
import os
import torch
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Download model weights (public)
model_file = hf_hub_download(repo_id="AMFORGE/foros", filename="foros.pt")

# Download tokenizer (gated: set the HF_TOKEN environment variable)
tok_file = hf_hub_download(
    repo_id="AMFORGE/foros_tok",
    filename="sparsforos_tokenizer.model",
    token=os.environ.get("HF_TOKEN"),
)

# Tokenizer
sp = spm.SentencePieceProcessor()
sp.Load(tok_file)

# Model: requires the SparseMind reference implementation
# (available with the tokenizer via AMEFORGE on request)
from sparsemind_robotics_train import SparseMind, Config

# weights_only=False unpickles arbitrary Python objects,
# so only load checkpoints from a source you trust.
ckpt = torch.load(model_file, map_location="cpu", weights_only=False)
cfg = Config(**{k: v for k, v in ckpt["config"].items()
                if k in Config.__dataclass_fields__})
model = SparseMind(cfg)
model.load_state_dict(ckpt["model"])
model.eval()

# Inference: greedy decoding (top_k=1) recommended for production
prompt = "move to x=0.5 y=-1.2 z=0.8 =>"
input_ids = torch.tensor([sp.EncodeAsIds(prompt)])
out_ids = model.generate(input_ids, max_new=128, temp=1.0, top_k=1)
result = sp.DecodeIds(out_ids[0, input_ids.shape[1]:].tolist())
print(result)
# [{"op":"move","x":0.5,"y":-1.2,"z":0.8}]
```
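Since valid-JSON rates are high but not 100% on long chains (96.6% on Tier 5), the decoded string is worth guarding before it reaches a controller. The helper below is one possible policy, sketched under assumptions: `parse_ops` is a hypothetical name, and falling back to a single `stop` op on malformed output is a design choice for illustration, not AMEFORGE's documented behavior.

```python
import json

STOP = [{"op": "stop"}]


def parse_ops(raw, fallback=STOP):
    """Parse decoded model output; return a copy of `fallback` if it is unusable."""
    try:
        ops = json.loads(raw.strip())
    except json.JSONDecodeError:
        return list(fallback)
    # The model is expected to emit a JSON array of operations.
    return ops if isinstance(ops, list) else list(fallback)


print(parse_ops('[{"op":"home"}]'))  # [{'op': 'home'}]
print(parse_ops('[{"op":'))          # [{'op': 'stop'}]
```

Raising an exception instead of substituting `stop` may be the better choice when the caller can retry or escalate to an operator.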
Citation
```bibtex
@misc{foros_robotics_v5_10,
  title  = {Foros v5.10: An On-Device Instruction-to-JSON Engine for Robotics},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture.
            https://huggingface.co/AMFORGE/foros}
}
```
License & Contact
- Model weights: Apache 2.0
- Tokenizer: gated access; contact AMEFORGE
- Inquiries: https://huggingface.co/AMFORGE