🌼 DaisyChain Genomics
A modular genomic mind: four dense ~74M DNA/RNA specialists (≈295M params total, under Carbon-500M) behind a learned router. Instead of one monolithic foundation model, DaisyChain trains one crisp specialist per biological domain, each distilled from Carbon-500M on its own domain, and routes each sequence to its home specialist.
| Specialist | Domain | Params |
|---|---|---|
| eukaryote | Eukaryotic genomic DNA | ~74M |
| prokaryote | Bacterial / prokaryotic DNA | ~74M |
| mrna | Mature mRNA (coding transcript) | ~74M |
| mrna_splice | Pre-mRNA / splice-site regions | ~74M |
The router
A small learned router reads each specialist's surprise (bits/base) and a PCA of its hidden state, then predicts the home domain, recovering the bias corrections a plain argmin-perplexity rule misses. Held-out routing accuracy: 94.8% (vs 87.5% argmin). Only one ~74M specialist runs per query, so inference is ~7× cheaper per token than the 500M monolith.
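To make the difference between the two routing rules concrete, here is a minimal sketch in plain Python. It is not DaisyChain's router: the PCA hidden-state features are left out, the "learned" part is reduced to a per-domain offset, and every number in it (the surprises and the `bias` values) is hypothetical. The function names are likewise illustrative and do not come from `daisychain.py`.

```python
import math

def bits_per_base(token_nll_nats, n_bases):
    """Convert per-token negative log-likelihoods (nats) to bits per base."""
    return sum(token_nll_nats) / (n_bases * math.log(2))

def argmin_route(surprise):
    """Plain rule: pick the specialist with the lowest bits/base."""
    return min(surprise, key=surprise.get)

def bias_corrected_route(surprise, bias):
    """Subtract a per-domain offset before taking the argmin.

    `bias` stands in for what a learned router could pick up, for example
    a specialist that looks slightly less surprised on every input.
    The values used below are hypothetical.
    """
    return min(surprise, key=lambda d: surprise[d] - bias[d])

# Hypothetical surprises (bits/base) of the four specialists on one sequence.
surprise = {"eukaryote": 1.90, "prokaryote": 1.93, "mrna": 1.88, "mrna_splice": 1.95}
# Hypothetical offsets: the mrna specialist reads 0.05 bits/base "too low".
bias = {"eukaryote": 0.0, "prokaryote": 0.0, "mrna": -0.05, "mrna_splice": 0.0}

plain_choice = argmin_route(surprise)
corrected_choice = bias_corrected_route(surprise, bias)
print(plain_choice, corrected_choice)
```

With these made-up numbers the plain argmin picks `mrna`, while the corrected rule picks `eukaryote`. That is the kind of disagreement a learned router is meant to resolve.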
How each specialist is built
Each specialist is trained by interleaving continued pretraining (next-token cross-entropy on its domain) with offline knowledge distillation from Carbon-500M (soft targets plus a factorized per-nucleotide variant via Carbon's FNS branch), i.e. cBTM-style domain experts, iterated per expert.
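The two objectives can be illustrated with a generic soft-target distillation loss in the style of Hinton et al. (2015). This sketch assumes a four-symbol vocabulary for readability and combines both terms with a mixing weight. `alpha` and `temperature` are hypothetical hyperparameters. The card does not say how the objectives are interleaved, and the factorized per-nucleotide (FNS) variant is not shown.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def cross_entropy(student_logits, target_index):
    """Next-token cross-entropy (nats) against the true token."""
    return -math.log(softmax(student_logits)[target_index])

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, target_index,
                      alpha=0.5, temperature=2.0):
    """(1 - alpha) * hard-label CE + alpha * T^2 * KL(teacher || student).

    alpha and temperature are illustrative defaults, not values from the card.
    """
    hard = cross_entropy(student_logits, target_index)
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = temperature ** 2 * kl_divergence(p_teacher, q_student)
    return (1 - alpha) * hard + alpha * soft

# Toy logits over A, C, G, T; the true next base is index 0 (A).
student = [0.0, 0.0, 0.0, 0.0]
teacher = [2.0, 0.0, 0.0, 0.0]
loss = distillation_loss(student, teacher, target_index=0)
print(round(loss, 4))
```

Setting `alpha=0` recovers plain continued pretraining, and `alpha=1` gives pure distillation, so alternating between the two phases is one possible reading of "interleaved".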
Capability vs Carbon-500M (the fair baseline)
| metric | DaisyChain | Carbon-500M |
|---|---|---|
| likelihood (bits/base, ↓) | 1.86 | 1.75 |
| seq-recovery eukaryote (↑) | 31.8% | 42.2% |
| seq-recovery bacteria (↑) | 34.0% | 49.5% |
DaisyChain trails the 500M/1T-token monolith but is within striking distance at ~15% of the active compute, and the gap keeps closing with more per-domain training (work in progress).
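The compute and likelihood figures can be checked with a few lines of arithmetic. The sketch assumes that per-token cost scales with active parameter count and ignores any cost of routing. It uses the rounded parameter counts and table values quoted in this card.

```python
# Rounded parameter counts quoted in this card.
specialist_params = 74e6
monolith_params = 500e6

active_fraction = specialist_params / monolith_params   # 0.148, i.e. ~15%
cost_ratio = monolith_params / specialist_params        # about 6.8, i.e. ~7x

def perplexity_per_base(bits):
    """Per-base perplexity implied by a bits/base likelihood."""
    return 2 ** bits

ppl_daisychain = perplexity_per_base(1.86)   # about 3.63
ppl_carbon = perplexity_per_base(1.75)       # about 3.36

# Sequence-recovery gaps in percentage points (Carbon-500M minus DaisyChain).
gap_eukaryote = 42.2 - 31.8
gap_bacteria = 49.5 - 34.0

print(f"active fraction: {active_fraction:.1%}, cost ratio: {cost_ratio:.1f}x")
print(f"perplexity/base: {ppl_daisychain:.2f} vs {ppl_carbon:.2f}")
print(f"recovery gaps: {gap_eukaryote:.1f} and {gap_bacteria:.1f} points")
```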
Usage
from daisychain import DaisyChain
dc = DaisyChain(root=".", device="cpu")
home, bits_per_base = dc.route("ACGTACGT...") # which domain?
print(home, bits_per_base)
print(dc.generate(home, length=180)) # sample from the home specialist
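For more than one sequence, a small helper can tally where a batch lands. The sketch below relies only on `route` returning a `(home, bits_per_base)` pair, as in the snippet above. `route_many`, `domain_counts`, and `StubRouter` are hypothetical names that are not part of `daisychain.py`. The stub's GC-content rule is an arbitrary placeholder so that the example runs without the model weights.

```python
from collections import Counter

def route_many(router, sequences):
    """Route each sequence and collect (sequence, home, bits_per_base)."""
    results = []
    for seq in sequences:
        home, bpb = router.route(seq)
        results.append((seq, home, bpb))
    return results

def domain_counts(results):
    """Count how many sequences were sent to each home domain."""
    return Counter(home for _, home, _ in results)

class StubRouter:
    """Hypothetical stand-in for a DaisyChain instance; NOT the real model.

    Routes by GC content purely so the example is runnable offline.
    """
    def route(self, seq):
        gc = sum(base in "GC" for base in seq) / max(len(seq), 1)
        home = "prokaryote" if gc > 0.5 else "eukaryote"
        return home, 2.0  # placeholder bits/base

results = route_many(StubRouter(), ["ACGTACGT", "GGCCGGCC", "ATATATAT"])
counts = domain_counts(results)
print(counts)
```

In real use, the `StubRouter()` argument would be replaced by the `dc` object constructed above.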
Files: daisychain.py (inference), model.py / specialist_presets.py /
spike_tokenizer.py / registry.py (architecture), tokenizer.json,
<domain>/model.safetensors (the 4 specialists), router2.pt (router).
Interactive demo: the DaisyChain Space routes DNA in real time.
Citation
If you use these models, please cite the author, Dean Byrne (Quazim0t0):
@misc{byrne2026daisychain,
title = {DaisyChain Genomics: A Modular Mixture of Per-Domain Distilled Genomic Specialists},
author = {Byrne, Dean},
year = {2026},
howpublished = {\url{https://huggingface.co/DaisyChainAI/daisychain-genomics}},
note = {DaisyChainAI (Quazim0t0). Four ~74M DNA/RNA specialists distilled per-domain
from Carbon-500M behind a learned router}
}
Built on
DaisyChain stands on these works:
@misc{carbon2025,
title = {Carbon: Genomic Foundation Models},
author = {{HuggingFaceBio}},
year = {2025},
howpublished = {\url{https://huggingface.co/HuggingFaceBio/Carbon-500M}}
}
@article{li2022branchtrainmerge,
title = {Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models},
author = {Li, Margaret and Gururangan, Suchin and Dettmers, Tim and Lewis, Mike and
Althoff, Tim and Smith, Noah A. and Zettlemoyer, Luke},
journal = {arXiv preprint arXiv:2208.03306},
year = {2022}
}
@article{gururangan2023cbtm,
title = {Scaling Expert Language Models with Unsupervised Domain Discovery},
author = {Gururangan, Suchin and Li, Margaret and Lewis, Mike and Shi, Weijia and
Althoff, Tim and Smith, Noah A. and Zettlemoyer, Luke},
journal = {arXiv preprint arXiv:2303.14177},
year = {2023}
}
@article{sukhbaatar2024btx,
title = {Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM},
author = {Sukhbaatar, Sainbayar and Golovneva, Olga and Sharma, Vasu and Xu, Hu and
Lin, Xi Victoria and Roziere, Baptiste and Kahn, Jacob and Li, Daniel and
Yih, Wen-tau and Weston, Jason and Li, Xian},
journal = {arXiv preprint arXiv:2403.07816},
year = {2024}
}
@article{hinton2015distilling,
title = {Distilling the Knowledge in a Neural Network},
author = {Hinton, Geoffrey and Vinyals, Oriol and Dean, Jeff},
journal = {arXiv preprint arXiv:1503.02531},
year = {2015}
}
@inproceedings{furlanello2018born,
title = {Born-Again Neural Networks},
author = {Furlanello, Tommaso and Lipton, Zachary C. and Tschannen, Michael and
Itti, Laurent and Anandkumar, Anima},
booktitle = {ICML},
year = {2018}
}
@inproceedings{gururangan2020dapt,
title = {Don't Stop Pretraining: Adapt Language Models to Domains and Tasks},
author = {Gururangan, Suchin and Marasovi{\'c}, Ana and Swayamdipta, Swabha and
Lo, Kyle and Beltagy, Iz and Downey, Doug and Smith, Noah A.},
booktitle = {ACL},
year = {2020}
}