Model Card for ProtGPT3-10B

Model Description

ProtGPT3-10B is a single-sequence autoregressive protein language model for protein sequence generation. It is the largest model in the ProtGPT3 family, an open-source suite of promptable and aligned protein language models ranging from 112M to 10B parameters. ProtGPT3 models use a causal Mixtral-style Mixture-of-Experts architecture and are trained for causal language modeling on protein sequences.

For more information and guidance on generating sequences with ProtGPT3-10B, see the detailed description in the ProtGPT3-1.3B model card and replace the model name (i.e., model_name=AI4PD/ProtGPT3-10B).
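As a rough illustration of swapping in the model name, the sketch below loads the checkpoint with the standard Hugging Face `transformers` causal-LM classes and samples a few sequences. It is a minimal sketch under stated assumptions, not the authors' reference pipeline: it assumes the checkpoint loads through `AutoModelForCausalLM` without custom code, that `accelerate` is installed for `device_map="auto"`, and that the sampling values are reasonable defaults. The helper names `sampling_config` and `generate_sequences` are hypothetical, and the expected prompt format is the one documented in the ProtGPT3-1.3B card, which is not reproduced here.

```python
MODEL_NAME = "AI4PD/ProtGPT3-10B"  # model name as given in this card


def sampling_config(max_new_tokens=256, top_p=0.95, temperature=1.0,
                    num_return_sequences=4):
    """Illustrative sampling settings (assumed values, not from the authors)."""
    return {
        "do_sample": True,
        "max_new_tokens": max_new_tokens,
        "top_p": top_p,
        "temperature": temperature,
        "num_return_sequences": num_return_sequences,
    }


def generate_sequences(prompt, model_name=MODEL_NAME, **overrides):
    """Sample sequences from the model for a given prompt string.

    Heavy dependencies are imported lazily so the config helper above can be
    used without torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # weights are published in BF16
        device_map="auto",           # assumption: requires `accelerate`
    )

    # The prompt format is an assumption here; follow the ProtGPT3-1.3B card.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    generation_kwargs = {**sampling_config(), **overrides}
    with torch.no_grad():
        outputs = model.generate(**inputs, **generation_kwargs)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

A call such as `generate_sequences(prompt, max_new_tokens=128)` would then return a list of decoded strings, where `prompt` is whatever prompt the ProtGPT3-1.3B card prescribes. A 10B-parameter model in BF16 needs roughly 20 GB of memory for the weights alone, so multi-GPU or offloaded loading may be required.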

Also consider ProtGPT3-10B-dpo, a model of the same size with improved sequence generation.

Out-of-Scope Use

The model should not be used as the sole basis for experimental, clinical, environmental, or safety-critical decisions. Generated proteins require downstream computational and experimental validation. The model is not guaranteed to generate functional, soluble, safe, or synthesizable proteins.

Bias, Risks, and Limitations

ProtGPT3-10B learns from public protein sequence datasets and may reproduce biases present in those datasets. Generated sequences may be low-complexity, nonfunctional, unstable, insoluble, or biologically implausible. Protein generation models may also present dual-use risks if used irresponsibly.
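Because generated sequences can be low-complexity, a cheap first-pass screen can be applied before any heavier computational or experimental validation. The following is a minimal sketch of such a screen and is not part of the ProtGPT3 release: the function names and the thresholds (`min_entropy`, `max_run`) are illustrative assumptions and would need tuning for a given protein family. Passing this filter says nothing about function, stability, or solubility.

```python
import math
from collections import Counter
from itertools import groupby

# The 20 canonical amino acids in one-letter code.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def shannon_entropy(seq):
    """Shannon entropy (bits) of the residue composition of a sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def longest_run(seq):
    """Length of the longest homopolymer run, e.g. 3 for 'AABBBC'."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def passes_basic_filters(seq, min_entropy=3.0, max_run=6):
    """Reject empty, non-canonical, or obviously low-complexity sequences.

    Thresholds are assumed values for illustration only.
    """
    if not seq or not set(seq) <= CANONICAL:
        return False
    return shannon_entropy(seq) >= min_entropy and longest_run(seq) <= max_run
```

Sequences that survive this screen are only candidates; structure prediction, solubility estimation, and wet-lab testing remain necessary.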

Citation

BibTeX:

@article{protgpt3,
  title={ProtGPT3: an Open-source family of Promptable and Aligned Protein Language Models},
  author={Anonymous Authors},
  year={2026}
}

More Information

For guidance on generating sequences with ProtGPT3-10B, see the detailed description in the ProtGPT3-1.3B model card. All models and code are released through the Hugging Face ecosystem and the accompanying code repository.

Safetensors · Model size: 10B params · Tensor type: BF16