🛠️

The Programming Framework

A Universal Method for Process Analysis

Combining Large Language Models with Mermaid visualization to dissect and understand complex processes across any discipline—from biology to business, physics to psychology.

📋 Summary

The Programming Framework is a universal meta-tool for analyzing complex processes across any discipline by combining Large Language Models (LLMs) with visual flowchart representation. The Framework transforms textual process descriptions into structured, interactive Mermaid flowcharts stored as JSON, enabling systematic analysis, visualization, and integration with knowledge systems.

The Framework has been demonstrated through GLMP (Genome Logic Modeling Project), with 100+ regulatory-process flowcharts, and extended to mathematics (algorithms plus axiomatic dependency graphs), chemistry, physics, and computer science. It serves as the foundational methodology for the CopernicusAI Knowledge Engine.

Foundational typology (2026): GLMP Foundational Typology (Primitive Relations and Computational Complexity) bridges the public Algorithms and Axiomatic Theories table and the GLMP database table (regulatory algorithms). This Space remains the hub for interactive viewers; the Google Cloud Storage (GCS) tables are the authoritative machine-readable indices.

📚 Prior Work & Research Contributions

Overview

The Programming Framework is prior work demonstrating a novel methodology for analyzing complex processes by combining Large Language Models (LLMs) with visual flowchart representation. It establishes a universal, domain-agnostic approach that transforms textual descriptions into structured, interactive visualizations.

🔬 Research Contributions

  • Universal Process Analysis: Domain-agnostic methodology applicable across multiple fields
  • LLM-Powered Extraction: Automated extraction using Google Gemini 2.0 Flash
  • Structured Visualization: Mermaid.js-based flowchart generation encoded as JSON
  • Iterative Refinement: Systematic approach enabling continuous improvement
  • Scale Demonstration: Applied to 314 processes across 6 discipline databases (Biology: 52, Chemistry: 91, Physics: 21, Computer Science: 21, Mathematics: 20, GLMP: 109)
  • Validation: Successfully processes complex biological, chemical, and computational workflows with high accuracy

⚙️ Technical Achievements

  • Meta-Tool Architecture: Framework for creating specialized analysis tools
  • JSON-Based Storage: Structured format enabling version control and API integration
  • Multi-Domain Application: Applied to biological processes (GLMP) and extended to mathematics, chemistry, physics, and computer science
  • Integration Framework: Designed for knowledge engines and collaborative platforms

🎯 Position Within CopernicusAI Knowledge Engine

The Programming Framework serves as the foundational meta-tool of the CopernicusAI Knowledge Engine, providing the underlying methodology that enables specialized applications:

  • GLMP (Genome Logic Modeling Project)
  • CopernicusAI (main knowledge engine)
  • Research Papers Metadata Database
  • Science Video Database
  • Multi-domain process analysis

This work establishes a proof-of-concept for AI-assisted process analysis, demonstrating how LLMs can systematically extract and visualize complex logic from textual sources across diverse domains.

Any Discipline · LLM Powered · Visual Flowcharts · JSON Structured Data

🎯 What is the Programming Framework?

The Programming Framework is a meta-tool—a tool for creating tools. It provides a systematic method for analyzing any complex process by combining the analytical power of Large Language Models with the clarity of visual flowcharts.

🔍 The Problem

Complex processes—whether biological, computational, or organizational—are difficult to understand because they involve many steps, decision points, and interactions. Traditional descriptions in text are hard to follow.

✨ The Solution

Use LLMs to extract process logic from literature, then encode it as Mermaid flowcharts stored in JSON. Result: Clear, interactive visualizations that reveal hidden patterns and enable systematic analysis.

⚙️ How It Works

1️⃣ Input Process: Provide scientific papers, documentation, or process descriptions.

2️⃣ LLM Analysis: AI extracts steps, decisions, branches, and logic flow.

3️⃣ Generate Flowchart: Create a Mermaid diagram encoded as a JSON structure.

4️⃣ Visualize & Iterate: The interactive flowchart reveals insights and enables refinement.
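
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Framework's actual code: `extract_steps` is a hypothetical stand-in for the LLM call (it returns a hard-coded result so the sketch runs offline), and the JSON field names `title` and `mermaid` are assumed rather than taken from the project's schema.

```python
import json

def extract_steps(text: str) -> list[tuple[str, str]]:
    """Hypothetical stand-in for step 2 (LLM analysis).

    A real implementation would send `text` to an LLM; the result is
    hard-coded here so the example runs without network access.
    """
    return [
        ("A", "ORC binds origin"),
        ("B", "Load MCM2-7 helicase"),
        ("C", "Unwind DNA"),
        ("D", "Synthesize new strands"),
    ]

def to_mermaid(steps: list[tuple[str, str]]) -> str:
    """Step 3: chain the extracted steps into a top-down Mermaid graph."""
    lines = ["graph TD"]
    for (src, src_label), (dst, dst_label) in zip(steps, steps[1:]):
        lines.append(f"    {src}[{src_label}] --> {dst}[{dst_label}]")
    return "\n".join(lines)

def to_record(title: str, source_text: str) -> str:
    """Wrap the Mermaid source in a JSON record (assumed schema)."""
    mermaid = to_mermaid(extract_steps(source_text))
    return json.dumps({"title": title, "mermaid": mermaid}, indent=2)

record = to_record("DNA replication", "DNA replication begins when ...")
print(record)
```

Step 4 (visualize and iterate) would load the stored record, render the `mermaid` string with Mermaid.js, and feed any corrections back into another extraction pass.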

📝 Concrete Example:

Input:

"DNA replication begins when the origin recognition complex (ORC) binds to DNA replication origins. This triggers the loading of the MCM2-7 helicase complex, which unwinds the DNA double helix. DNA polymerases then synthesize new strands using the unwound strands as templates..."

LLM Analysis:

Extracts 15 steps, identifies 3 decision points (origin recognition, helicase loading, polymerase binding), recognizes 4 key enzymes (ORC, MCM2-7, DNA polymerase, ligase), and maps regulatory checkpoints.

Output:

Mermaid flowchart with 25 nodes, 28 edges, 3 decision gates, properly colored using the 5-color scheme (red for inputs, yellow for structures, green for operations, blue for intermediates, violet for products), stored as structured JSON enabling interactive visualization and programmatic access.
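
Figures such as node, edge, and decision-gate counts can be recomputed from the Mermaid source itself. The sketch below is a simplified counter, assuming every node is declared at least once with a `[...]` (box) or `{...}` (decision) shape and that all edges use `-->`; it is not a full Mermaid parser, and the function name `summarize` is illustrative.

```python
import re

# A node declaration: an identifier followed by a [box] or {decision} shape.
NODE = re.compile(r"\b(\w+)(\[[^\]]*\]|\{[^}]*\})")
EDGE = re.compile(r"-->")

def summarize(mermaid: str) -> dict[str, int]:
    """Count declared nodes, edges, and decision nodes in simple Mermaid."""
    shapes: dict[str, str] = {}
    for node_id, shape in NODE.findall(mermaid):
        shapes[node_id] = shape[0]  # '[' marks a box, '{' a decision
    return {
        "nodes": len(shapes),
        "edges": len(EDGE.findall(mermaid)),
        "decisions": sum(1 for s in shapes.values() if s == "{"),
    }

sample = """graph TD
    A[Origin DNA] --> B{ORC bound?}
    B -->|Yes| C[Load MCM2-7 helicase]
    B -->|No| A
"""
print(summarize(sample))
```

Nodes that are only ever referenced by bare identifier, without a shape, are not counted by this sketch.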

📊 Live Interactive Example:

graph TD
    A[Complex Process Input] --> B{LLM Analysis}
    B -->|Extract Logic| C[Identify Steps]
    B -->|Extract Decisions| D[Identify Branches]
    C --> E[Create Flowchart Nodes]
    D --> F[Create Decision Points]
    E --> G[Generate Mermaid Syntax]
    F --> G
    G --> H[Store as JSON]
    H --> I[Interactive Visualization]
    I --> J{Insights Gained?}
    J -->|No| K[Refine Analysis]
    J -->|Yes| L[Apply Knowledge]
    K --> B
    style A fill:#ff6b6b,color:#fff
    style B fill:#74c0fc,color:#fff
    style C fill:#51cf66,color:#fff
    style D fill:#51cf66,color:#fff
    style E fill:#ffd43b,color:#000
    style F fill:#ffd43b,color:#000
    style G fill:#51cf66,color:#fff
    style H fill:#74c0fc,color:#fff
    style I fill:#74c0fc,color:#fff
    style J fill:#74c0fc,color:#fff
    style K fill:#51cf66,color:#fff
    style L fill:#b197fc,color:#fff

Color Legend:

  • Red - Triggers & Inputs
  • Yellow - Structures & Objects
  • Green - Processing & Operations
  • Blue - Intermediates & States
  • Violet - Products & Outputs
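
The color legend can be applied programmatically when generating Mermaid `style` lines. In the sketch below the hex values are copied from the live example above; the category keys (`input`, `structure`, and so on) and the function name `style_lines` are assumptions made for illustration.

```python
# (fill, text color) per category; hex values taken from the live example.
PALETTE = {
    "input": ("#ff6b6b", "#fff"),         # red: triggers & inputs
    "structure": ("#ffd43b", "#000"),     # yellow: structures & objects
    "operation": ("#51cf66", "#fff"),     # green: processing & operations
    "intermediate": ("#74c0fc", "#fff"),  # blue: intermediates & states
    "product": ("#b197fc", "#fff"),       # violet: products & outputs
}

def style_lines(categories: dict[str, str]) -> list[str]:
    """Return one Mermaid `style` directive per node id."""
    return [
        f"style {node} fill:{PALETTE[cat][0]},color:{PALETTE[cat][1]}"
        for node, cat in categories.items()
    ]

for line in style_lines({"A": "input", "L": "product"}):
    print(line)
```

An unknown category raises `KeyError`, which doubles as a basic consistency check on the 5-color scheme.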

💡 Core Principles

🌍

Domain Agnostic

Works across any field: biology, chemistry, software engineering, business processes, legal workflows, manufacturing, and beyond.

🔄

Iterative Refinement

Start with a rough analysis, visualize it, identify gaps, refine with the LLM, and repeat until the process logic is clear.

📦

Structured Data

JSON storage enables programmatic access, version control, cross-referencing, and integration with other tools and databases.
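
One practical detail behind the version-control benefit is serializing records deterministically, so that two runs producing the same data also produce byte-identical files and clean diffs. The following is a small sketch using only Python's standard `json` module; the record fields are hypothetical.

```python
import json

def dump_stable(record: dict) -> str:
    """Serialize with sorted keys and fixed indentation for stable diffs."""
    return json.dumps(record, indent=2, sort_keys=True, ensure_ascii=False) + "\n"

# Same content, different key order: the serialized text is identical.
a = dump_stable({"title": "DNA replication", "mermaid": "graph TD"})
b = dump_stable({"mermaid": "graph TD", "title": "DNA replication"})
print(a == b)
```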

📚 Process Diagram Collections

The Programming Framework has been applied across multiple scientific disciplines. Explore interactive flowchart collections organized by domain:

🧬 Biology

GLMP regulatory flowcharts (gene circuits, logic gates) are indexed in the GLMP table; higher-level organismal processes use the Biology Processes database. Both use the same Mermaid idiom as the mathematics corpus (see the 2026 typology paper).

GLMP table: sortable metadata and viewers for 100+ regulatory processes · GLMP Space

⚗️ Chemistry (under construction)

Hugging Face batch pages are being reorganized. The live chemistry index and metadata live on Google Cloud Storage.

🗄️ Chemistry Database Table →

Growing collection; see the table for current counts and subcategories. · Local batch index (preview)

🔢 Mathematics

Algorithms and Axiomatic Theories: procedural flowcharts plus axiom–theorem dependency graphs, indexed in the mathematics processes database (see the 2026 foundational typology paper).

🗄️ Mathematics Database Table →

Local batch index on this Space · Working paper

⚛️ Physics (under construction)

Hugging Face batch pages are under construction. The physics database table on GCS remains the primary index.

🗄️ Physics Database Table →

Local batch index (preview)

💻 Computer Science (under construction)

Hugging Face batch pages are under construction. The computer science database table on GCS remains the primary index.

🗄️ Computer Science Database Table →

Local batch index (preview)

⚙️ Technical Architecture

🤖 LLM Integration

  • Google Gemini 2.0 Flash for analysis
  • Vertex AI for enterprise deployment
  • Custom prompts for process extraction
  • Structured JSON output formatting
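
As an illustration of custom prompts combined with structured JSON output, the sketch below builds an extraction prompt and parses a model reply. The prompt wording, the `nodes`/`edges` keys, and both function names are assumptions, not the project's actual prompts or schema; no LLM API is called, so the reply is passed in as a plain string.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in

PROMPT_TEMPLATE = (
    "Extract the process described below as JSON with two keys: "
    '"nodes" (a list of objects with "id", "label", "category") and '
    '"edges" (a list of objects with "from", "to").\n\n'
    "Text:\n{text}"
)

def build_prompt(text: str) -> str:
    """Fill the (hypothetical) extraction prompt with the source text."""
    return PROMPT_TEMPLATE.format(text=text)

def parse_response(raw: str) -> dict:
    """Parse a model reply, tolerating a code fence around the JSON."""
    cleaned = raw.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence.
        cleaned = cleaned.split("\n", 1)[1].rsplit(FENCE, 1)[0]
    data = json.loads(cleaned)
    missing = {"nodes", "edges"} - data.keys()
    if missing:
        raise ValueError(f"response missing keys: {sorted(missing)}")
    return data

reply = FENCE + 'json\n{"nodes": [], "edges": []}\n' + FENCE
print(parse_response(reply))
```

Rejecting replies that lack the expected keys keeps malformed output from reaching the flowchart generator.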

📊 Visualization Stack

  • Mermaid.js for flowchart rendering
  • JSON schema for data validation
  • Interactive SVG output
  • Export to PNG/PDF supported

💾 Data Storage

  • Google Cloud Storage for JSON files
  • Firestore for metadata indexing
  • Version control with Git
  • Cross-referencing with papers database

🔗 Integration Points

  • GLMP specialized collections
  • CopernicusAI knowledge graph
  • Research papers database
  • API endpoints for programmatic access

✅ Validation & Accuracy

🔍 Quality Assurance Process

  • Automated Validation: All flowcharts validated for Mermaid syntax correctness before publication
  • Metadata Quality Checks: JSON schema validation ensures >=85% metadata completeness (NSF standard)
  • Source Citation Verification: All processes include verified research paper citations with DOI/PubMed links
  • Cross-Reference Validation: Automated checks ensure discipline links and back-references are correct
  • Color Scheme Consistency: All processes follow standardized 5-color scheme for visual consistency
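
A metadata completeness score like the one described above can be computed as the fraction of required fields that are present and non-empty. The sketch below is one plausible reading of that check; the field list is hypothetical, and only the 85% threshold comes from the text.

```python
# Hypothetical required fields; the real schema may differ.
REQUIRED_FIELDS = [
    "title", "discipline", "description", "mermaid",
    "citations", "keywords", "created",
]

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(
        1 for field in REQUIRED_FIELDS
        if record.get(field) not in (None, "", [], {})
    )
    return filled / len(REQUIRED_FIELDS)

def passes(record: dict, threshold: float = 0.85) -> bool:
    """Apply the 85% completeness threshold."""
    return completeness(record) >= threshold

record = {
    "title": "DNA replication", "discipline": "Biology",
    "description": "Origin firing through strand synthesis",
    "mermaid": "graph TD", "citations": ["placeholder"],
    "keywords": ["replication"], "created": "",
}
print(completeness(record), passes(record))
```

With seven required fields, a record may leave at most one empty and still pass, since 6/7 is about 0.857.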

📊 Scale & Coverage

  • 314 Processes Validated: Successfully applied across 6 discipline databases (Biology, Chemistry, Physics, CS, Mathematics, GLMP)
  • Multi-Domain Testing: Framework validated on biological pathways, chemical reactions, computational algorithms, and mathematical proofs
  • Iterative Refinement: Processes refined through multiple LLM analysis cycles to improve accuracy
  • User Feedback Integration: Community feedback mechanism enables continuous improvement (see "Improve this process" on each flowchart)
  • Expert Validation: GLMP processes validated against established biochemical pathway databases

🎯 Accuracy Measures

Syntax Accuracy

100% of published flowcharts render without Mermaid syntax errors

Metadata Completeness

>=85% average quality score across all processes (exceeds NSF requirements)

Source Coverage

All processes include 1-3 verified research paper citations with accessible links
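
The citation requirement lends itself to an automated format check. The sketch below verifies only that each record carries 1-3 citations and that each citation has a well-formed DOI or a numeric PubMed ID; it does not confirm that the links resolve. The `doi` and `pmid` field names are assumptions, and the DOI pattern is a commonly used approximation rather than a complete grammar.

```python
import re

# Approximate pattern for modern DOIs: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def citation_problems(citations: list[dict]) -> list[str]:
    """Return a list of human-readable problems (empty if none)."""
    problems = []
    if not 1 <= len(citations) <= 3:
        problems.append(f"expected 1-3 citations, found {len(citations)}")
    for i, citation in enumerate(citations):
        doi = citation.get("doi", "")
        pmid = str(citation.get("pmid", ""))
        if not (DOI_PATTERN.match(doi) or pmid.isdigit()):
            problems.append(f"citation {i} has no well-formed DOI or PubMed ID")
    return problems

print(citation_problems([{"doi": "10.1000/xyz123"}]))
print(citation_problems([]))
```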

⚠️ Known Limitations

  • LLM-Dependent Accuracy: Flowchart accuracy depends on LLM interpretation of source material; complex processes may require multiple refinement cycles
  • Domain Expertise Required: While the Framework is domain-agnostic, optimal results benefit from domain-specific knowledge for validation
  • Source Material Quality: Accuracy is limited by the quality and completeness of input source material
  • Continuous Improvement: Framework is actively refined based on user feedback and validation results

🔗 Related Projects

🧬 GLMP - Genome Logic Modeling

Regulatory “algorithms” for microbial circuits—indexed in the GLMP database table, with interactive viewers on the GLMP Space.

Explore GLMP →

🔬 CopernicusAI

Knowledge engine integrating the Programming Framework with AI podcasts, research papers, and a knowledge graph for scientific discovery.

Visit CopernicusAI →

How to Cite This Work

Welz, G. (2024–2025). The Programming Framework: A Universal Method for Process Analysis.
Hugging Face Spaces. https://huggingface.co/spaces/garywelz/programming_framework

BibTeX Format:

@misc{welz2025programmingframework,
  title={The Programming Framework: A Universal Method for Process Analysis},
  author={Welz, Gary},
  year={2024--2025},
  url={https://huggingface.co/spaces/garywelz/programming_framework},
  note={Hugging Face Spaces}
}

Welz, G. (2024). From Inspiration to AI: Biology as Visual Programming.
Medium. https://medium.com/@garywelz_47126/from-inspiration-to-ai-biology-as-visual-programming-520ee523029a

This project serves as a foundational meta-tool for AI-assisted process analysis, enabling systematic extraction and visualization of complex logic from textual sources across diverse scientific and technical domains.

The Programming Framework is designed as infrastructure for AI-assisted science, providing a universal methodology that can be specialized for domain-specific applications.