🧲 TRIGNUM-300M
The Pre-Flight Check for Autonomous AI
"You wouldn't let a plane take off without a pre-flight check.
Why are we letting AI agents act without one?"
What Is This?
TRIGNUM-300M is a zero-model reasoning integrity validator for LLM outputs. It catches structural logic failures (contradictions, circular reasoning, non-sequiturs) before an AI agent acts on them.
from trignum_core.subtractive_filter import SubtractiveFilter

# `agent` and `agent_output` come from your own agent framework.
sf = SubtractiveFilter()
result = sf.apply(agent_output)

if result.illogics_found:
    agent.halt(reason=result.illogics_found)
    # T-CHIP glows RED 🔴 → Human review required
else:
    agent.execute()
    # T-CHIP glows BLUE 🔵 → Cleared for takeoff
No LLM. No API. No training data. ~300 lines of Python. <1ms.
🔬 Benchmark Results
We expanded our evaluation to 58,000+ real LLM outputs including a new 517-sample curated dataset for structural reasoning. Honest results:
| Benchmark | Samples | Precision | Recall | F1 | Speed |
|---|---|---|---|---|---|
| Structural illogic (curated) | 517 | 100% | 98.9% | 99.5% | <1ms |
| HaluEval (full dataset) | 58,293 | 60% | 2.1% | 4.0% | 706ms |
What this means:
- 99.5% F1 on structural reasoning failures: contradictions, circular logic, unsupported conclusions
- 4.0% F1 on factual hallucinations: we don't catch wrong facts
That's the point. Fact-checking tools are plentiful. We found no model-free tools for reasoning-checking. Until now.
Per-Task Breakdown (HaluEval)
| Task | n | Precision | Recall | F1 |
|---|---|---|---|---|
| QA | 18,316 | 83.3% | 0.25% | 0.50% |
| Dialogue | 19,977 | 60.1% | 4.38% | 8.16% |
| Summarization | 20,000 | 57.4% | 1.60% | 3.11% |
Throughput: 146,866 samples/second, orders of magnitude faster than LLM-based validation.
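As a sanity check, the F1 column can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is not part of the repository; the helper name `f1_score` is ours, and the inputs are the rounded figures from the per-task table, so the last digit can differ from values computed on raw counts. It also converts the reported throughput into a mean time per sample.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the per-task table, as fractions.
reported = {
    "QA": (0.833, 0.0025),
    "Dialogue": (0.601, 0.0438),
    "Summarization": (0.574, 0.0160),
}

for task, (p, r) in reported.items():
    print(f"{task}: F1 = {100 * f1_score(p, r):.2f}%")

# Mean time per sample implied by the reported throughput.
throughput_per_second = 146_866
microseconds_per_sample = 1e6 / throughput_per_second
print(f"{microseconds_per_sample:.2f} microseconds per sample")
```

Recall dominates the result: with recall near 2%, even a precision above 80% cannot lift F1 past a few percent.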
✈️ The Pre-Flight Check Analogy
A pre-flight checklist doesn't verify that London exists. It verifies that:
- ✓ Instruments don't contradict each other
- ✓ There are no circular faults (sensor A confirms B confirms A)
- ✓ The flight computer draws conclusions from actual data
- ✓ Systems are logically consistent
The Subtractive Filter does the same for AI reasoning:
LLM Output → Subtractive Filter → [PASS] 🔵 → Agent Executes
                                → [FAIL] 🔴 → Agent Halts → Human Review
🤖 The Missing "Agentic Validator"
In the context of the recent shift towards Agentic Reasoning, autonomous LLMs are moving from static prompts to dynamic thought-action loops involving planning, tool-use, and multi-agent collaboration.
Current systems rely heavily on probabilistic models to act as the "Critic/Evaluator" or use "Validator-Driven Feedback" via unit tests for code or simulators for robotics. But there has been no validator for pure logic. If an agent hallucinates a non-sequitur or circular justification during its internal planning phase, the error cascades.
TRIGNUM-300M fills this exact gap. It acts as a deterministic, <1ms Validator-Driven Feedback gate. It halts execution if the agent's internal thought (z_t) contains a structural illogic, providing an immediate failure signal (r_t = 0) before the agent commits to an irreversible external action (a_t).
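The gate described above can be sketched as a wrapper around a single thought-action step. This is an illustration under assumptions, not the library's API: `StepOutcome`, `gated_step`, and `toy_validate` are hypothetical names, the validator is a stand-in, and returning a reward of 1 on a pass is our assumption (the text only specifies 0 on a halt).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class StepOutcome:
    reward: int      # r_t: 0 on a halt (from the text); 1 on a pass is an assumption
    executed: bool   # whether the action a_t was carried out
    illogics: List[str] = field(default_factory=list)


def gated_step(
    thought: str,
    action: str,
    validate: Callable[[str], Sequence[str]],
    execute: Callable[[str], None],
) -> StepOutcome:
    """Validate the thought z_t first; run the action a_t only if it is clean."""
    illogics = list(validate(thought))
    if illogics:
        return StepOutcome(reward=0, executed=False, illogics=illogics)
    execute(action)
    return StepOutcome(reward=1, executed=True)


def toy_validate(thought: str) -> List[str]:
    """Stand-in validator: flags a thought that opens with a bare conclusion."""
    if thought.lower().startswith("therefore"):
        return ["non-sequitur: conclusion without premises"]
    return []


executed_actions: List[str] = []
plan: List[Tuple[str, str]] = [
    ("The disk is 95% full, so old logs should be archived.", "archive_logs"),
    ("Therefore the database must be dropped.", "drop_database"),
    ("Archiving finished, so the alert can be cleared.", "clear_alert"),
]

for thought, action in plan:
    outcome = gated_step(thought, action, toy_validate, executed_actions.append)
    if not outcome.executed:
        print(f"HALT before {action!r}: {outcome.illogics}")
        break

print(executed_actions)
```

In a real integration, `validate` would presumably wrap `SubtractiveFilter().apply(...)` and return its `illogics_found`. The key design point is ordering: validation runs before `execute`, so a flagged thought never reaches the irreversible action.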
🔺 Core Architecture
The Trignum Pyramid
Three faces acting as magnetic poles for data separation:
| Face | Role | What It Does |
|---|---|---|
| α (Logic) | Truth detection | Identifies structurally sound reasoning |
| β (Illogic) | Error detection | Catches contradictions, circular logic, non-sequiturs |
| γ (Context) | Human grounding | Anchors output to human intent |
T-CHIP: The Tensor Character
┌───────────────────────────────────────────────────────┐
│                    T-CHIP [v.300M]                    │
│                                                       │
│  🔵 Blue = Logic Stable (Cleared for Takeoff)         │
│  🔴 Red  = Illogic Detected (THE FREEZE)              │
│  🟡 Gold = Human Pulse Locked (Sovereign Override)    │
│                                                       │
│  Response time: <1ms | False alarms: 0% (structural)  │
└───────────────────────────────────────────────────────┘
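The three glow states map naturally onto a small enum. The sketch below is hypothetical and does not reproduce `tchip.py`: the names `Glow` and `glow_state` are ours, and the rule that a human override takes precedence over a detected illogic is an assumption drawn from the "Sovereign Override" label.

```python
from enum import Enum


class Glow(Enum):
    BLUE = "Logic Stable"
    RED = "Illogic Detected"
    GOLD = "Human Pulse Locked"


def glow_state(illogic_detected: bool, human_override: bool = False) -> Glow:
    """Pick a glow state from the filter verdict and an optional human override."""
    # Assumption: a human override wins over the automatic verdict.
    if human_override:
        return Glow.GOLD
    return Glow.RED if illogic_detected else Glow.BLUE


print(glow_state(illogic_detected=False))
print(glow_state(illogic_detected=True))
print(glow_state(illogic_detected=True, human_override=True))
```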
The Subtractive Filter
Four detection layers, all pattern-based:
| Layer | Catches | Method |
|---|---|---|
| Contradiction | "X is always true. X is never true." | Antonym pairs, negation patterns |
| Circular Logic | A proves B proves A | Reference chain analysis |
| Non-Sequitur | "Therefore X" without premises | Causal connective analysis |
| Depth Check | Claims without any reasoning | Assertion density scoring |
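To make the pattern-based approach concrete, here is a toy version of the first three layers. It is a deliberately minimal sketch and not the code in `subtractive_filter.py`: every function name, regex, and word list below is an assumption chosen for illustration, and the circular-logic check takes an already-extracted map of "claim is supported by" links rather than parsing text.

```python
import re
from typing import Dict, List, Set, Tuple


def sentences(text: str) -> List[str]:
    """Split text into trimmed sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


# Contradiction: the same subject and predicate asserted with opposite quantifiers.
CLAIM = re.compile(
    r"^(?P<subject>.+?) is (?P<quantifier>always|never) (?P<predicate>.+)$",
    re.IGNORECASE,
)


def find_contradictions(text: str) -> List[Tuple[str, str]]:
    seen: Dict[Tuple[str, str], Set[str]] = {}
    for sentence in sentences(text):
        match = CLAIM.match(sentence)
        if match:
            key = (match["subject"].lower(), match["predicate"].lower())
            seen.setdefault(key, set()).add(match["quantifier"].lower())
    return [key for key, quantifiers in seen.items() if len(quantifiers) == 2]


# Circular logic: a claim that ends up supporting itself (cycle detection by DFS).
def has_circular_support(supports: Dict[str, List[str]]) -> bool:
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(claim: str) -> bool:
        if claim in done:
            return False
        if claim in visiting:
            return True  # reached a claim that is still on the current path
        visiting.add(claim)
        found = any(visit(parent) for parent in supports.get(claim, []))
        visiting.discard(claim)
        done.add(claim)
        return found

    return any(visit(claim) for claim in list(supports))


# Non-sequitur: the text opens with a conclusion marker and no prior premise.
CONCLUSION = re.compile(r"^(therefore|thus|hence|consequently)\b", re.IGNORECASE)


def opens_with_conclusion(text: str) -> bool:
    parts = sentences(text)
    return bool(parts) and bool(CONCLUSION.match(parts[0]))


print(find_contradictions("X is always true. X is never true."))
print(has_circular_support({"A": ["B"], "B": ["A"]}))
print(opens_with_conclusion("Therefore X."))
```

Each check is a regex pass or a graph walk, which is consistent with the sub-millisecond latency claimed for a model-free filter. A production version would need far broader patterns than these three.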
📦 Repository Structure
TRIGNUM-300M-TCHIP/
├── src/
│   └── trignum_core/                  # Core Python library
│       ├── pyramid.py                 # Trignum Pyramid (3 magnetic faces)
│       ├── tchip.py                   # T-CHIP (glow states)
│       ├── subtractive_filter.py      # ★ The Subtractive Filter
│       ├── human_pulse.py             # Human sovereignty layer
│       └── magnetic_trillage.py       # Data separation
├── tests/                             # 34 unit tests (all passing)
├── benchmarks/
│   ├── hallucination_benchmark.py     # Curated structural test
│   ├── full_halueval_benchmark.py     # Full 58K HaluEval test
│   ├── results.json                   # Structural benchmark results
│   └── full_halueval_results.json     # Full HaluEval results
├── demo/
│   └── index.html                     # Three.js 3D interactive demo
├── paper/
│   └── TRIGNUM_300M_Position_Paper.md # Position paper
├── docs/
│   └── theory/                        # 6 foundational theory documents
├── T-CHIP CLEARED FOR TAKEOFF.md      # The pitch
└── ROADMAP.md                         # 2-quarter development plan
🚀 Quick Start
# Clone
git clone https://github.com/trace-on-lab/trignum-300m.git
cd trignum-300m
# Install
pip install -r requirements.txt
pip install -e .
# Run the structural benchmark
python benchmarks/hallucination_benchmark.py
# Run the full HaluEval benchmark (downloads ~13MB of data)
python benchmarks/full_halueval_benchmark.py
# Run tests
pytest tests/ -v
🔍 Prior Art: Nobody Is Doing This
We searched arXiv, ResearchGate, ACL Anthology, and Semantic Scholar. Every existing reasoning validation system we found requires model inference:
| System | Requires Model | Validates Reasoning |
|---|---|---|
| VerifyLLM (2025) | Yes | Partially |
| ContraGen | Yes | Partially |
| Process Supervision (OpenAI) | Yes | Yes |
| Guardrails AI | Configurable | No (content) |
| Subtractive Filter | No | Yes |
Existing work uses LLMs to check LLMs. TRIGNUM uses logic to check LLMs.
Read the full analysis in our position paper.
⚛️ Quantum Integration: TQPE
TRIGNUM-300M serves as Phase 1 ("Technical A Priori Validation") for Trignumental Quantum Phase Estimation (TQPE).
In our case study estimating the ground state energy of the H₂ molecule, TRIGNUM validated the physical consistency and structural logic of the quantum circuit before execution. By acting as the preliminary gatekeeper, TRIGNUM ensured that no quantum resources were wasted on structurally ill-formed configurations, enabling an epistemic confidence score of 82.8% on the final estimate (-1.1384 Ha).
Read the full BUILDING THE BRIDGE paper on Trignumentality and TQPE in the foundational Trignumentality repository.
📚 Documentation
| Document | Description |
|---|---|
| Core Postulate | The fundamental axioms of Trignum |
| Three Faces | α (Logic), β (Illogic), γ (Context) |
| Magnetic Trillage | Data separation mechanism |
| T-CHIP Spec | The Tensor Character in detail |
| Cold State Hardware | Hardware implications |
| Hallucination Paradox | Reframing the "Big Monster" |
| Position Paper | Full academic paper with benchmarks |
| Roadmap | 2-quarter development plan |
💎 The Golden Gems
| Gem | Wisdom |
|---|---|
| GEM 1 | "The Human Pulse is the Master Clock" |
| GEM 2 | "The Illogic is the Compass" |
| GEM 3 | "Magnetic Trillage Over Brute Force" |
| GEM 4 | "The Hallucination is the Raw Material" |
| GEM 5 | "T-CHIP is the Mirror" |
🤝 Contributing
See CONTRIBUTING.md for guidelines.
📜 License
MIT License: see LICENSE.
📞 Contact
TRACE ON LAB
📧 traceonlab@proton.me
🛡️ The Call
"The most dangerous AI failure is not a wrong fact. It is reasoning that sounds right but isn't."
┌───────────────────────────────────────────────────────┐
│  🧲 TRACE ON LAB | TRIGNUM-300M | v.300M              │
│                                                       │
│  The Pre-Flight Check for Autonomous AI.              │
│  Zero models. Zero API calls. 146,866 samples/second. │
│                                                       │
│  🔵 T-CHIP: CLEARED FOR TAKEOFF.                      │
└───────────────────────────────────────────────────────┘
⭐ Star this repo if you believe AI should check its logic before it acts.