QuantFunc
Hugging Face | ModelScope | GitHub | WeChat | Discord
Ideogram-4-Series
Config-only repository, no model weights. This repo contains only a QuantFunc per-layer precision config (precision-config/ideogram4_a4w4.json). It does not contain, mirror, or redistribute any Ideogram model weights. You bring your own officially obtained Ideogram 4 model; this config only tells the QuantFunc engine how to quantize it at load time, on your own machine.
Powered by the QuantFunc ComfyUI plugin, the fastest diffusion inference engine:
- 2x to 11x speedup over standard BF16/FP16 Python pipelines.
- Native C++/CUDA (libquantfunc.so / quantfunc.dll), zero Python model dependencies.
- Universal format adapter: loads diffusers / BFL (Flux) / HF / nunchaku SVDQ layouts directly, no manual conversion.
- Full GPU coverage: RTX 20/30/40/50 · A100/H100/H200/B100/B200/GB300 · RTX 6000 Ada / PRO Blackwell (CUDA 12 & 13); native FP4 on Blackwell.
Install the plugin: https://github.com/RealJonathanYip/ComfyUI-QuantFunc
What this repository provides
Just the precision config, no weights:
Ideogram-4-Series/
├── config.json                  # canonical per-layer precision map (W4A4)
└── precision-config/
    └── ideogram4_a4w4.json      # identical copy, named for manual / plugin use
config.json and precision-config/ideogram4_a4w4.json are identical. Both are the W4A4 precision map; pick whichever your workflow expects.
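If you want to confirm that the two copies really are identical in your local checkout, a byte-level hash comparison is enough. The following is a minimal sketch, not part of QuantFunc; the helper names `sha256_of` and `configs_match` are made up for this example, and it assumes you pass the path of a local clone of this repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def configs_match(repo_root: Path) -> bool:
    """True if both copies of the precision map are byte-identical.

    Assumes repo_root is a local checkout laid out as in the tree above.
    """
    canonical = repo_root / "config.json"
    named_copy = repo_root / "precision-config" / "ideogram4_a4w4.json"
    return sha256_of(canonical) == sha256_of(named_copy)
```

Calling `configs_match(Path("."))` from the repository root should return `True` for an unmodified checkout.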
We deliberately do not host Ideogram 4 weights. The QuantFunc Lighting backend does runtime quantization: you load the official weights and they are quantized in-memory at load, so no pre-quantized checkpoint is ever distributed.
How to use
- Obtain the official Ideogram 4 model yourself in any QuantFunc-supported layout (diffusers, BFL/Flux-style, or HF). Follow Ideogram's official distribution channels and license terms.
- Install the QuantFunc ComfyUI plugin: https://github.com/RealJonathanYip/ComfyUI-QuantFunc
- Load the official model through the Build Pipeline node (universal format adapter).
- Precision config: leave the node on auto detect (it recognizes Ideogram 4 and applies ideogram4_a4w4.json automatically), or point it at this file manually. The Lighting engine then runtime-quantizes the transformer to W4A4 (4-bit heavy GEMMs + 8-bit sensitive projections).
Precision config: ideogram4_a4w4.json
Per-layer precision map (mirrors the Klein-style configs). Measured on a dual-transformer 24 GB card (cuda_overhead 399 MB) to fit and render a coherent, prompt-matching image, with sharper detail than FP16-non-block.
| Layer group | Precision | Why |
|---|---|---|
| layers.attention.qkv · layers.attention.o | 4-bit (AUTO_4 → INT4 on SM89, FP4 on SM120) | self-attention projections; large K/N, quant-robust |
| layers.feed_forward.w1/w2/w3 | 4-bit | SwiGLU MLP; largest matrices, primary memory target |
| input_proj · llm_cond_proj · t_embedding.mlp_in/out · adaln_proj · final_layer.linear | 8-bit (AUTO_8 → FP8 on SM89+, INT8 older; W8A8) | sensitive non-block projection GEMMs |
| layers.adaln_modulation · final_layer.adaln_modulation | FP16 | M=1 modulation GEMVs; per-token activation quant collapses conditioning, so the engine skips them |

Net: 170 block GEMMs @ 4-bit · 5 non-block projection GEMMs @ AUTO_8 (FP8 on SM89) · 2 adaLN-modulation GEMVs @ FP16.
Verified coherent on SM89 (INT8 dashboard-run + FP8 CLI-run, each cuda_overhead 399 MB). AUTO_8 picks FP8 on SM89 for better dynamic range on these sensitive projections.
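To make the AUTO_4 / AUTO_8 rules and the layer groups above concrete, here is a minimal Python sketch of how a layer name could be mapped to a precision class and then to a concrete format for a given GPU. It is an illustration only, not the QuantFunc engine's code and not the schema of ideogram4_a4w4.json. The function names, the numeric block index in names such as `layers.12.attention.qkv`, and the exact thresholds are assumptions: the source states only the SM89 and SM120 cases for 4-bit, and "SM89+ vs. older" for 8-bit.

```python
# Layer groups copied from the table above.
AUTO_4_LAYERS = {
    "layers.attention.qkv", "layers.attention.o",
    "layers.feed_forward.w1", "layers.feed_forward.w2", "layers.feed_forward.w3",
}
AUTO_8_LAYERS = {
    "input_proj", "llm_cond_proj", "t_embedding.mlp_in", "t_embedding.mlp_out",
    "adaln_proj", "final_layer.linear",
}
FP16_LAYERS = {"layers.adaln_modulation", "final_layer.adaln_modulation"}


def normalize(layer_name: str) -> str:
    """Drop numeric block indices (assumed naming, e.g. 'layers.12.attention.o')."""
    return ".".join(part for part in layer_name.split(".") if not part.isdigit())


def precision_class(layer_name: str) -> str:
    """Return 'AUTO_4', 'AUTO_8' or 'FP16' for a layer, per the table."""
    key = normalize(layer_name)
    if key in FP16_LAYERS:
        return "FP16"
    if key in AUTO_8_LAYERS:
        return "AUTO_8"
    if key in AUTO_4_LAYERS:
        return "AUTO_4"
    raise KeyError(f"layer not covered by the table: {layer_name}")


def resolve(cls: str, sm: int) -> str:
    """Resolve a precision class for a compute capability (89 means SM 8.9).

    Assumption: FP4 from SM120 upward and INT4 below. The source does not
    say how other architectures (for example SM100) are handled.
    """
    if cls == "AUTO_4":
        return "FP4" if sm >= 120 else "INT4"
    if cls == "AUTO_8":
        return "FP8" if sm >= 89 else "INT8"
    return cls  # FP16 layers are left unquantized
```

For example, `resolve(precision_class("layers.12.attention.qkv"), 89)` gives `"INT4"`, while `resolve(precision_class("input_proj"), 89)` gives `"FP8"`, matching the SM89 column of the table.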
Hardware
- NVIDIA RTX 20-series and above (CUDA 12 & 13). Native FP4 on Blackwell (SM120); INT4 on SM89.
- Fits a 24 GB card with the a4w4 map (measured cuda_overhead 399 MB).
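As a rough illustration of why the 4-bit block GEMMs are the main memory lever, the sketch below computes raw weight storage for a single matrix at different bit widths. The matrix shape is hypothetical (it is not taken from Ideogram 4), and the estimate ignores quantization scales and other metadata, so treat it as back-of-the-envelope arithmetic rather than a measurement.

```python
def weight_bytes(k: int, n: int, bits: int) -> int:
    """Bytes needed to store a K x N weight matrix at the given bit width.

    Ignores quantization scales / zero points, which add a small overhead.
    """
    return k * n * bits // 8


# Hypothetical shape, chosen only to make the ratios easy to read.
K, N = 4096, 16384
for label, bits in (("FP16", 16), ("8-bit", 8), ("4-bit", 4)):
    print(f"{label}: {weight_bytes(K, N, bits) / 2**20:.0f} MiB")
```

For this made-up shape the storage drops from 128 MiB at FP16 to 32 MiB at 4-bit, a 4x reduction per matrix before metadata.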
Legal / Attribution
- This repository distributes only the QuantFunc precision-config JSON, which is our own work, licensed under Apache-2.0.
- It contains no Ideogram weights and is not affiliated with, nor endorsed by, Ideogram.
- "Ideogram" is a trademark of its respective owner. You are solely responsible for obtaining the official model and complying with its license and terms of use.
Community
- Discord server
- Scan the QR code below to join our WeChat group:
Model tree for QuantFunc/Ideogram-4-Series
Base model: ideogram-ai/ideogram-4-fp8