CheXmix: Unified Generative Pretraining for Vision Language Models in Medical Imaging
Paper: arXiv:2604.22989
CheXmix (CVPR Findings 2026) is a unified early-fusion generative model trained on a large corpus of chest X-rays paired with radiology reports. This Hugging Face repository provides the model weights and an example file.
For an editable installation, clone the GitHub repository and install it with pip:
git clone https://github.com/StanfordMIMI/CheXmix.git
cd CheXmix
pip install -e .
For usage instructions, please visit the GitHub repository.
.
├── README.md
├── model.safetensors <CheXmix (S1 + S2) checkpoint>
└── vqgan.ckpt <Image Tokenizer checkpoint>
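Before loading the full checkpoint, it can help to check what `model.safetensors` contains. The sketch below reads only the file's JSON header, relying on the safetensors layout (an 8-byte little-endian header length followed by a UTF-8 JSON header). It uses only the Python standard library. The helper names `read_safetensors_header` and `summarize` are hypothetical and are not part of the CheXmix codebase; the file is assumed to have been downloaded locally already.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading any tensor data."""
    with Path(path).open("rb") as f:
        # The first 8 bytes hold the header length as a little-endian unsigned 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The next header_len bytes are a UTF-8 JSON object describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Map each tensor name to (dtype, shape), skipping the optional "__metadata__" entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `summarize(read_safetensors_header("model.safetensors"))` returns a dictionary keyed by tensor name, which gives a quick view of parameter names, dtypes, and shapes without reading the weights into memory. The same approach does not apply to `vqgan.ckpt`, which is not a safetensors file.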
If you find this repository useful for your work, please cite the paper:
@inproceedings{kumar2026chexmix,
author = {Kumar, Ashwin and Holland, Robbie and Barrett, Corey and Kim, Jangwon and Varma, Maya and Chen, Zhihong and Gao, Yunhe and Zaharchuk, Greg and Taghavi, Tara and Kenthapadi, Krishnaram and Chaudhari, Akshay},
title = {CheXmix: Unified Generative Pretraining for Vision Language Models in Medical Imaging},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
pages = {9466--9476},
year = {2026},
note = {arXiv preprint arXiv:2604.22989}
}