BlueCodec: speech autoencoder (codec only)

This repository publishes only the neural audio codec used by BlueTTS: a 44.1 kHz speech autoencoder that maps waveforms to a low-rate continuous latent sequence and back. It is not a full TTS model (no text encoder, duration model, or flow stack).

| If you need… | Use |
| --- | --- |
| End-to-end ONNX TTS | notmax123/blue-onnx + BlueTTS |
| Full PyTorch stack + stats (training / voice export) | notmax123/blue (includes blue_codec.safetensors alongside TTL/DP weights) |
| Training the codec from scratch | maxmelichov/blue-codec (standalone repo & training doc) |

Project home: https://github.com/maxmelichov/BlueTTS · Live demo: Hugging Face Space (notmax123/Blue)

What it does

  • Encoder: waveform → spectrogram features → 24-dimensional latents at ~86 Hz (compact trajectory for downstream TTS).
  • Decoder: latents → high-quality 44.1 kHz audio (causal stack + vocoder head).

Downstream BlueTTS modules (flow matching, duration, text-to-latent) run in this latent space, which keeps synthesis lightweight and fast.
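To make the "~86 Hz" figure concrete, the sketch below derives the latent frame rate from the 44.1 kHz sample rate and the hop of 512 listed in the architecture summary. It assumes one latent frame per STFT hop, which matches the stated rate but is not confirmed here. The function names are illustrative, and the exact frame count for a clip depends on padding that this card does not specify.

```python
from __future__ import annotations

SAMPLE_RATE = 44_100  # Hz, from this model card
HOP = 512             # STFT hop in samples, from the architecture summary
LATENT_DIM = 24       # latent channels, from this model card


def latent_rate_hz(sample_rate: int = SAMPLE_RATE, hop: int = HOP) -> float:
    """Latent frames per second, assuming one frame per STFT hop."""
    return sample_rate / hop


def approx_latent_shape(seconds: float) -> tuple[int, int]:
    """Approximate (channels, frames) for a clip.

    Assumption: frames = samples // hop. The real codec may pad the
    signal and produce one or two frames more or fewer.
    """
    n_samples = int(round(seconds * SAMPLE_RATE))
    return LATENT_DIM, n_samples // HOP


def values_per_second_ratio() -> float:
    """Waveform samples per second divided by latent values per second."""
    return SAMPLE_RATE / (LATENT_DIM * latent_rate_hz())


if __name__ == "__main__":
    print(round(latent_rate_hz(), 2))           # frame rate in Hz
    print(approx_latent_shape(2.0))             # shape for a 2-second clip
    print(round(values_per_second_ratio(), 2))  # reduction in values per second
```

Under these assumptions the latent sequence carries roughly 21 times fewer values per second than the raw waveform, which is what makes it a convenient target for the downstream modules.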

Architecture (summary)

| Piece | Details |
| --- | --- |
| Input | 1253-channel spectrogram (1025 log-linear + 228 log-mel; FFT 2048, hop 512) |
| Encoder (~25.6M params) | Conv1d stem (1253→512) + 10 ConvNeXt blocks + projection (512→24) |
| Decoder (~25.3M params) | CausalConv1d stem (24→512) + 10 causal dilated ConvNeXt blocks + vocoder head |
| Latent | 24-D @ ~86 Hz |
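The numbers in this summary are mutually consistent, and a small sanity check shows how they fit together. The `CodecDims` class below is a hypothetical container written for this illustration, not part of the BlueTTS code. It only restates the values above: an FFT of 2048 yields 1025 one-sided bins, which together with the 228 mel bands gives the 1253 input channels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecDims:
    """Hypothetical holder for the dimensions listed in the summary."""

    n_fft: int = 2048
    hop: int = 512
    n_mels: int = 228
    hidden: int = 512
    latent_dim: int = 24

    @property
    def linear_bins(self) -> int:
        # A real-valued FFT of size n_fft has n_fft // 2 + 1 one-sided bins.
        return self.n_fft // 2 + 1

    @property
    def input_channels(self) -> int:
        # Log-linear and log-mel features are concatenated along channels.
        return self.linear_bins + self.n_mels


dims = CodecDims()
print(dims.linear_bins, dims.input_channels)

# Approximate parameter counts in millions, as stated in the summary.
total_params_m = 25.6 + 25.3
print(round(total_params_m, 1))
```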

Checkpoint in this repo

| File | Role |
| --- | --- |
| model.safetensors | Encoder + decoder weights (Safetensors). State dict keys are typically prefixed with encoder.* and decoder.*. |

*(An older naming convention in some local scripts is ae_latest.safetensors; the file served from this Hub repo is model.safetensors.)*
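One way to check the key prefixes before loading the weights is to list the tensor names and count them by their first dotted component. This is a minimal sketch: the helper names are made up for this example, the sample key names are hypothetical, and `checkpoint_keys` assumes the `safetensors` and `numpy` packages are installed.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def count_by_prefix(keys: Iterable[str]) -> Counter:
    """Count tensor names by the part before the first dot."""
    return Counter(key.split(".", 1)[0] for key in keys)


def checkpoint_keys(path: str) -> list[str]:
    """List tensor names in a Safetensors file without reading tensor data."""
    from safetensors import safe_open  # imported lazily; optional dependency

    with safe_open(path, framework="np") as f:
        return list(f.keys())


# Hypothetical key names, used only to show the grouping.
example_keys = ["encoder.stem.weight", "encoder.stem.bias", "decoder.head.weight"]
print(count_by_prefix(example_keys))
```

With the file downloaded, `count_by_prefix(checkpoint_keys("./blue_codec_only/model.safetensors"))` should show how many tensors fall under each prefix. Any prefix other than `encoder` or `decoder` would suggest the checkpoint follows a different layout.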

Download

hf download notmax123/blue-codec --repo-type model --local-dir ./blue_codec_only

Equivalent:

huggingface-cli download notmax123/blue-codec --repo-type model --local-dir ./blue_codec_only

Repo id is case-sensitive: notmax123/blue-codec.
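The same download can be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The wrapper name `download_codec` is illustrative, and the default target directory mirrors the CLI commands above.

```python
REPO_ID = "notmax123/blue-codec"  # case-sensitive, as noted above


def download_codec(local_dir: str = "./blue_codec_only") -> str:
    """Download the repo snapshot and return the local folder path."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="model",
        local_dir=local_dir,
    )
```

Calling `download_codec()` needs network access. The returned path should contain `model.safetensors`.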

License

MIT. Align usage with BlueTTS and the blue-codec repository for any training or redistribution terms that apply to your use case.
