QuantFunc

🤗 Hugging Face  |  🤖 ModelScope  |  💻 GitHub  |  💬 WeChat (微信)  |  🎮 Discord

Ideogram-4-Series

⚠️ Config-only repository: no model weights. This repo contains only a QuantFunc per-layer precision config (precision-config/ideogram4_a4w4.json). It does not contain, mirror, or redistribute any Ideogram model weights. You bring your own officially-obtained Ideogram 4 model; this config only tells the QuantFunc engine how to quantize it at load time, on your own machine.

Powered by the QuantFunc ComfyUI plugin, the fastest diffusion inference engine:

  • 🚀 2x–11x speedup over standard BF16/FP16 Python pipelines.
  • ⚙️ Native C++/CUDA (libquantfunc.so / quantfunc.dll), zero Python model dependencies.
  • 🧩 Universal format adapter: loads diffusers / BFL (Flux) / HF / nunchaku SVDQ layouts directly, no manual conversion.
  • 🟢 Full GPU coverage: RTX 20/30/40/50 · A100/H100/H200/B100/B200/GB300 · RTX 6000 Ada / PRO Blackwell (CUDA 12 & 13); native FP4 on Blackwell.

👉 Install the plugin: https://github.com/RealJonathanYip/ComfyUI-QuantFunc

What this repository provides

Just the precision config, no weights:

Ideogram-4-Series/
├── config.json                     # canonical per-layer precision map (W4A4)
└── precision-config/
    └── ideogram4_a4w4.json         # identical copy, named for manual / plugin use

config.json and precision-config/ideogram4_a4w4.json are identical. Both are the W4A4 precision map; pick whichever your workflow expects.
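If you want to confirm that your local checkout really has two matching copies, a small check like the one below can help. It is only a sketch: the function name `configs_match` is made up for illustration, and it compares the parsed JSON objects rather than raw bytes, so differences in whitespace or key order are ignored.

```python
import json
from pathlib import Path


def configs_match(repo_root: str) -> bool:
    """Return True when both precision-config files parse to the same JSON object.

    `configs_match` is a hypothetical helper, not part of QuantFunc.
    """
    root = Path(repo_root)
    canonical = json.loads((root / "config.json").read_text(encoding="utf-8"))
    named = json.loads(
        (root / "precision-config" / "ideogram4_a4w4.json").read_text(encoding="utf-8")
    )
    # Parsed comparison: key order and formatting do not matter.
    return canonical == named
```

Call it with the path to your local copy of this repository, for example `configs_match("Ideogram-4-Series")`.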

We deliberately do not host Ideogram 4 weights. The QuantFunc Lighting backend does runtime quantization: you load the official weights and they are quantized in-memory at load, so no pre-quantized checkpoint is ever distributed.

How to use

  1. Obtain the official Ideogram 4 model yourself in any QuantFunc-supported layout (diffusers, BFL/Flux-style, or HF). Follow Ideogram's official distribution channels and license terms.
  2. Install the QuantFunc ComfyUI plugin: https://github.com/RealJonathanYip/ComfyUI-QuantFunc
  3. Load the official model through the Build Pipeline node (universal format adapter).
  4. Precision config: leave the node on auto detect (it recognizes Ideogram 4 and applies ideogram4_a4w4.json automatically), or point it at this file manually. The Lighting engine then runtime-quantizes the transformer to W4A4 (4-bit heavy GEMMs + 8-bit sensitive projections).

Precision config: ideogram4_a4w4.json

Per-layer precision map (mirrors the Klein-style configs). On a dual-transformer 24 GB card (cuda_overhead 399 MB), this map was measured to fit in memory and to render a coherent, prompt-matching image, with sharper detail than FP16-non-block.

| Layer group | Precision | Why |
|---|---|---|
| layers.attention.qkv · layers.attention.o | 4-bit (AUTO_4 → INT4 on SM89, FP4 on SM120) | self-attention projections; large K/N, quant-robust |
| layers.feed_forward.w1/w2/w3 | 4-bit | SwiGLU MLP; largest matrices, primary memory target |
| input_proj · llm_cond_proj · t_embedding.mlp_in/out · adaln_proj · final_layer.linear | 8-bit (AUTO_8 → FP8 on SM89+, INT8 on older GPUs; W8A8) | sensitive non-block projection GEMMs |
| layers.adaln_modulation · final_layer.adaln_modulation | FP16 | M=1 modulation GEMVs; per-token activation quantization collapses conditioning, so the engine skips them |
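To make the table easier to reason about, here is a minimal Python sketch of the two decisions it describes: which precision tag a layer group gets, and what an AUTO tag resolves to on a given GPU. It is an illustration only, not the engine's code or the JSON schema. The helper names (`group_precision`, `resolve`), the indexed layer-name form (`layers.0.attention.qkv`), and the SM100 threshold for FP4 are assumptions; the source only states INT4 on SM89, FP4 on SM120, and native FP4 on Blackwell.

```python
# Hypothetical constants mirroring the table; the real config schema may differ.
BLOCK_4BIT_SUFFIXES = (
    "attention.qkv", "attention.o",
    "feed_forward.w1", "feed_forward.w2", "feed_forward.w3",
)
NON_BLOCK_8BIT = {
    "input_proj", "llm_cond_proj", "t_embedding.mlp_in",
    "t_embedding.mlp_out", "adaln_proj", "final_layer.linear",
}


def group_precision(layer_name: str) -> str:
    """Map a layer name to the precision tag listed in the table."""
    if layer_name.endswith("adaln_modulation"):
        return "FP16"  # M=1 modulation GEMVs are left unquantized
    if layer_name.startswith("layers.") and layer_name.endswith(BLOCK_4BIT_SUFFIXES):
        return "AUTO_4"
    if layer_name in NON_BLOCK_8BIT:
        return "AUTO_8"
    raise KeyError(f"no rule for layer: {layer_name}")


def resolve(tag: str, sm: int) -> str:
    """Resolve an AUTO tag for a compute capability such as 89 or 120."""
    if tag == "AUTO_4":
        # Assumption: every Blackwell part (SM100 and up) takes the FP4 path.
        return "FP4" if sm >= 100 else "INT4"
    if tag == "AUTO_8":
        return "FP8" if sm >= 89 else "INT8"  # "FP8 on SM89+, INT8 older"
    return tag  # fixed precisions such as FP16 pass through unchanged
```

For example, `resolve(group_precision("layers.0.feed_forward.w1"), 89)` returns `"INT4"`, while the same layer on SM120 resolves to `"FP4"`.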

Net: 170 block GEMMs @ 4-bit · 5 non-block projection GEMMs @ AUTO_8 (FP8 on SM89) · 2 adaLN-modulation GEMVs @ FP16.

Verified coherent on SM89 (INT8 dashboard-run + FP8 CLI-run, each cuda_overhead 399 MB). AUTO_8 picks FP8 on SM89 for better dynamic range on these sensitive projections.
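The dynamic-range argument can be illustrated with a quick calculation. The sketch below assumes the FP8 format is E4M3 as defined in the OCP FP8 specification (largest finite value 448, smallest positive subnormal 2^-9); the source only says "FP8", so treat that choice as an assumption. It compares the ratio of largest to smallest nonzero magnitude against a symmetric INT8 grid.

```python
import math

# Assumed format: OCP FP8 E4M3. The source does not say which FP8 variant is used.
E4M3_MAX = 448.0
E4M3_MIN_SUBNORMAL = 2.0 ** -9

# Symmetric INT8: nonzero magnitudes run from 1 to 127 in uniform steps.
INT8_MAX = 127
INT8_MIN_NONZERO = 1


def dynamic_range_db(largest: float, smallest: float) -> float:
    """Ratio of largest to smallest nonzero magnitude, in decibels."""
    return 20.0 * math.log10(largest / smallest)


fp8_db = dynamic_range_db(E4M3_MAX, E4M3_MIN_SUBNORMAL)
int8_db = dynamic_range_db(INT8_MAX, INT8_MIN_NONZERO)
print(f"FP8 E4M3: {fp8_db:.1f} dB, INT8: {int8_db:.1f} dB")
```

The wider span of representable magnitudes is what lets a floating-point 8-bit format track outlier-heavy activations, at the cost of coarser spacing near the top of its range.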

Hardware

  • NVIDIA RTX 20-series and above (CUDA 12 & 13). Native FP4 on Blackwell (SM120); INT4 on SM89.
  • Fits a 24 GB card with the a4w4 map (measured cuda_overhead 399 MB).

Legal / Attribution

  • This repository distributes only the QuantFunc precision-config JSON, which is our own work, licensed under Apache-2.0.
  • It contains no Ideogram weights and is not affiliated with, nor endorsed by, Ideogram.
  • "Ideogram" is a trademark of its respective owner. You are solely responsible for obtaining the official model and complying with its license and terms of use.

Community

  • 🎮 Discord server
  • 💬 Scan the QR code below to join our WeChat group:
WeChat Group