DaisyChainAI

🌼 DaisyChainAI

We build capable systems by daisy-chaining a handful of small, sharp specialists behind a learned router — instead of training one giant model to do everything. Each specialist is cheap, swappable, and crisp on its own domain; chained together, they behave like one model at a fraction of the active compute.

🔗 What "daisy-chaining" means

A daisy chain links independent units in series so a signal can flow from one to the next, each unit handling what it's good at and passing the rest along. That's exactly how our systems work:

Each link is one small specialist — a dense ~74M model trained on a single domain. It is excellent at its own data and (deliberately) surprised by everything else.
The router is the connector between links. When an input arrives, it travels down the chain: every specialist reports how surprised it is (bits/base) and exposes its hidden state, and a tiny learned router hands the work to the link that's most at home with it.
The chain grows link by link. Because the specialists are trained separately, you can chain a new domain on without retraining the others — add a link, extend the router, done. Remove or upgrade a single link the same way.
One link runs per query. Once the router has picked a link, only that specialist computes, so a chain of four ~74M experts uses ~74M active parameters per token, roughly 7× fewer than a 500M monolith of comparable scope (500 / 74 ≈ 6.8).
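
The routing idea above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the "specialists" here are hypothetical unigram base-frequency tables standing in for real ~74M models, the names `gc_rich`, `at_rich` and `balanced` are made up, and the rule shown is the plain lowest-surprise baseline rather than the learned router.

```python
import math

def bits_per_base(seq, probs):
    """Mean negative log2-likelihood per base under a toy unigram model."""
    return -sum(math.log2(probs[base]) for base in seq) / len(seq)

# Hypothetical specialists: base frequencies stand in for trained models.
SPECIALISTS = {
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}

def route(seq):
    """Score seq with every specialist and pick the least surprised one."""
    surprise = {name: bits_per_base(seq, p) for name, p in SPECIALISTS.items()}
    return min(surprise, key=surprise.get), surprise

choice, surprise = route("GCGCGGCCGA")
print(choice, {k: round(v, 3) for k, v in surprise.items()})

# Chaining on a new link: no existing specialist is touched in this sketch.
SPECIALISTS["balanced"] = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
choice2, _ = route("ACGTACGTAC")
print(choice2)
```

With a lowest-surprise rule, adding a link needs no router change at all; a learned router would additionally need its output layer extended to cover the new domain.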

So "DaisyChain" is both the brand and the mechanism: a chain of specialists, connected by routing, that you extend one flower at a time.

🛠️ How the models are built

Each specialist is grown by interleaving two steps, per domain:

Continued pretraining — next-token training on only that domain's data, so the specialist becomes genuinely crisp on its home distribution (and the router can tell the links apart).
Per-domain distillation — the specialist is distilled from a larger teacher foundation model restricted to its own domain (soft-target KD, plus a factorized per-nucleotide variant where the teacher supports it). It learns the teacher's behavior on its slice without ever becoming a generic clone — the specialization is what makes routing work.
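
For readers unfamiliar with soft-target KD, here is a minimal sketch of the loss for a single position. It assumes the common formulation (KL divergence between temperature-softened teacher and student distributions, scaled by T²); the card does not state the temperature, weighting, or exact loss used, and the logits below are invented. The factorized per-nucleotide variant is not shown because the card does not define it.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)   # soft targets
    q = softmax(student_logits, temperature)   # student prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

teacher = [2.0, 0.5, 0.1, -1.0]                # hypothetical logits over A, C, G, T
matched = kd_loss(teacher, teacher)            # student agrees with teacher
uniform = kd_loss([0.0, 0.0, 0.0, 0.0], teacher)  # uninformed student
print(matched, round(uniform, 4))
```

In per-domain distillation as described above, a loss of this kind would be computed only on sequences from the specialist's own domain, which is what keeps the student from drifting toward a generic copy of the teacher.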

We iterate those two steps until each link is as strong as its capacity allows, then train the router: a small head that reads every specialist's surprise plus a compressed view of its hidden state and predicts the home domain. Because the head is learned, it can apply bias corrections that a plain "lowest-perplexity-wins" rule misses.
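
To make the difference from "lowest-perplexity-wins" concrete, here is a toy forward pass of such a head. Everything in it is an assumption for illustration: the card does not specify the router's architecture, the feature layout, or any weights, and the numbers below are hand-set rather than trained.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def router_head(surprises, hidden_summaries, W, b):
    """Linear head over [surprises + flattened hidden summaries] -> domain probabilities."""
    features = list(surprises) + [h for vec in hidden_summaries for h in vec]
    logits = [sum(w * x for w, x in zip(row, features)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)

# Two specialists, each reporting a surprise (bits/base) and a 1-dim hidden summary.
surprises = [1.90, 1.95]      # specialist 0 is slightly less surprised
hidden = [[0.2], [0.9]]       # hypothetical compressed hidden states

# One row per domain. Feature order: [surprise_0, surprise_1, hidden_0, hidden_1].
W = [[-1.0, 0.0, 0.5, 0.0],
     [0.0, -1.0, 0.0, 0.5]]
b = [0.0, 0.1]                # stands in for a learned per-domain offset

probs = router_head(surprises, hidden, W, b)
naive = min(range(2), key=lambda i: surprises[i])   # lowest-surprise rule
learned = max(range(2), key=lambda i: probs[i])     # head's decision
print(naive, learned, [round(p, 3) for p in probs])
```

Here the naive rule picks specialist 0, while the head picks specialist 1 because the hidden-state evidence and the offset outweigh a 0.05 bits/base gap. In practice the weights would be fit on labeled held-out sequences, for example with a cross-entropy objective.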

This is, in lineage, a Cluster-Branch-Train-Merge (c-BTM) style mixture of domain experts (independent experts plus perplexity-aware routing), with iterative distillation from a larger teacher layered on top.

🧬 Current project — DaisyChain Genomics

Four DNA/RNA specialists (eukaryote · prokaryote · mRNA · mRNA-splice, ~74M each, ≈295M total — under 500M), each distilled per-domain from a 500M genomic foundation model, behind a learned router.


Metric	Value
Routing accuracy (held-out)	94.8%
Active params / query	~74M (one specialist)
Likelihood vs the 500M teacher	within ~6%; closing with training
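
The parameter figures above follow from simple arithmetic, sketched here with the rounded numbers from this card. The variable names are illustrative, and the count ignores the router itself and any scoring pass each specialist performs during routing.

```python
specialist_params_m = 74   # ~74M per specialist (rounded, from the card)
num_specialists = 4
teacher_params_m = 500     # the 500M teacher / monolith baseline

total_m = specialist_params_m * num_specialists   # parameters stored across the chain
active_m = specialist_params_m                    # one routed specialist per query
ratio = teacher_params_m / active_m               # teacher params per active param

print(f"total ~{total_m}M, active ~{active_m}M, teacher/active ~{ratio:.1f}x")
```

Using the rounded 74M figure gives 296M total; the card's ≈295M presumably reflects the exact per-model counts.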

📦 Model: DaisyChainAI/daisychain-genomics
🎮 Live demo: Daisychain-Genomics-Demo — paste DNA, watch the chain light up specialist-by-specialist and route in real time.

More links on the chain — and more chains — coming. 🌼

Citation

If you use these models, please cite the author — Dean Byrne (Quazim0t0):

@misc{byrne2026daisychain,
  title        = {DaisyChain Genomics: A Modular Mixture of Per-Domain Distilled Genomic Specialists},
  author       = {Byrne, Dean},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/DaisyChainAI/daisychain-genomics}},
  note         = {DaisyChainAI (Quazim0t0). Four ~74M DNA/RNA specialists distilled per-domain
                  from Carbon-500M behind a learned router}
}