BiliSakura/MiniT2I-diffusers
Self-contained MiniT2I text-to-image checkpoints for Hugging Face diffusers. Each variant folder ships its own pipeline code, component modules, bundled FLAN-T5-Large text encoder, and transformer weights.
Converted from MiniT2I/MiniT2I using MiniT2I-diffusers in Visual-Generative-Foundation-Model-Collection.
Available checkpoints
| Subfolder | Model | Params (denoiser + text encoder) | Patch | Recommended CFG |
|---|---|---|---|---|
| MiniT2I-B-16/ | MiniT2I-B/16 | 258M + 341M | 16 | 2.5 |
| MiniT2I-L-16/ | MiniT2I-L/16 | 912M + 341M | 16 | 6.0 |
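As a rough sizing aid, the parameter counts in the table translate into a weights-only memory footprint in bfloat16 (2 bytes per parameter). The sketch below is illustrative only: `PARAMS_M`, `total_params_m`, and `weights_gb` are hypothetical names, and the estimate ignores activations, classifier-free-guidance batching, and framework overhead.

```python
# Parameter counts in millions, copied from the checkpoint table:
# (denoiser, text encoder).
PARAMS_M = {
    "MiniT2I-B-16": (258, 341),
    "MiniT2I-L-16": (912, 341),
}

BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each parameter in 2 bytes


def total_params_m(variant: str) -> int:
    """Total parameters (in millions) for denoiser plus text encoder."""
    denoiser, text_encoder = PARAMS_M[variant]
    return denoiser + text_encoder


def weights_gb(variant: str, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Weights-only size in GB (1 GB = 1e9 bytes); excludes activations."""
    return total_params_m(variant) * 1e6 * bytes_per_param / 1e9


if __name__ == "__main__":
    for name in PARAMS_M:
        print(f"{name}: {total_params_m(name)}M params, ~{weights_gb(name):.2f} GB in bf16")
```

By this estimate the B/16 bundle needs a little over 1 GB for weights and the L/16 bundle about 2.5 GB, before any runtime buffers.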
Repo layout
BiliSakura/MiniT2I-diffusers/
├── README.md
├── MiniT2I-B-16/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── conversion_metadata.json
│   ├── demo.png
│   ├── scheduler/
│   │   └── scheduler_config.json
│   ├── text_encoder/
│   ├── tokenizer/
│   └── transformer/
│       ├── config.json
│       ├── diffusion_pytorch_model.safetensors
│       └── transformer_minit2i.py
└── MiniT2I-L-16/
    └── ...
Each variant is self-contained: load with custom_pipeline=.../pipeline.py and trust_remote_code=True. MiniT2I denoises directly in RGB pixel space (no VAE).
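Because there is no VAE, the transformer operates on the image itself rather than on a downsampled latent. Assuming the denoiser splits the 512×512 RGB image into non-overlapping 16×16 patches (the usual reading of the "/16" suffix; the bundled `transformer_minit2i.py` is the authority here), the sequence length and per-token size work out as in this sketch. The function name `patch_stats` is hypothetical.

```python
def patch_stats(height: int = 512, width: int = 512, patch: int = 16, channels: int = 3):
    """Return (num_tokens, values_per_token) for non-overlapping patches."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    num_tokens = (height // patch) * (width // patch)
    values_per_token = patch * patch * channels
    return num_tokens, values_per_token


if __name__ == "__main__":
    tokens, dim = patch_stats()
    # 32 x 32 patches, each holding 16 * 16 * 3 raw pixel values
    print(tokens, dim)
```

Under that assumption, each image becomes 1024 tokens of 768 raw pixel values.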
Demo
Prompt: "A lonely astronaut standing on a quiet beach under two moons." Generated with MiniT2I-B/16 at 512×512, 100 steps, guidance_scale=2.5, seed 42.
Load from Hugging Face
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/MiniT2I-diffusers/MiniT2I-B-16",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    "A lonely astronaut standing on a quiet beach under two moons.",
    num_inference_steps=100,
    guidance_scale=2.5,
    generator=generator,
).images[0]
image.save("demo.png")
For MiniT2I-L/16, use MiniT2I-L-16 and guidance_scale=6.0.
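If you would rather fetch a single variant explicitly before loading it, one option is `huggingface_hub.snapshot_download` with an `allow_patterns` filter, followed by a local load from the downloaded subfolder. This is a sketch, not the repository's documented path: `variant_patterns` and `download_variant` are hypothetical helpers, and the code assumes `huggingface_hub` is installed and that the variant folders sit at the repo root as shown in the layout.

```python
from pathlib import Path

REPO_ID = "BiliSakura/MiniT2I-diffusers"


def variant_patterns(variant: str) -> list[str]:
    """Glob patterns that select only one variant folder of the repo."""
    return [f"{variant}/*"]


def download_variant(variant: str = "MiniT2I-B-16") -> Path:
    """Download one variant folder and return its local path."""
    # Imported lazily so variant_patterns works without the dependency.
    from huggingface_hub import snapshot_download

    root = snapshot_download(repo_id=REPO_ID, allow_patterns=variant_patterns(variant))
    return Path(root) / variant

# Example (requires network access):
#   model_dir = download_variant("MiniT2I-L-16")
```

The returned path can then be passed to `DiffusionPipeline.from_pretrained` together with `custom_pipeline` and `trust_remote_code=True`.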
Load from a local clone
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./MiniT2I-B-16").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    "A lonely astronaut standing on a quiet beach under two moons.",
    num_inference_steps=100,
    guidance_scale=2.5,
    generator=generator,
).images[0]
image.save("demo.png")
Load a variant subfolder (e.g. ./MiniT2I-B-16), not the repo root.
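A quick pre-flight check can catch the repo-root mistake before `from_pretrained` fails with a less obvious error. The sketch below only looks for files named in the repo layout; the function name `missing_variant_files` and the exact set of required files are assumptions made for illustration.

```python
from pathlib import Path

# Files the layout section lists inside every variant folder.
REQUIRED_FILES = (
    "pipeline.py",
    "model_index.json",
    "scheduler/scheduler_config.json",
    "transformer/config.json",
    "transformer/diffusion_pytorch_model.safetensors",
)


def missing_variant_files(model_dir) -> list[str]:
    """Return the required files absent from model_dir (empty list if OK)."""
    root = Path(model_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_variant_files("./MiniT2I-B-16")
    if missing:
        print("Not a variant folder; missing:", ", ".join(missing))
    else:
        print("Looks like a variant folder.")
```

Pointing the check at the repo root reports every required file as missing, since those files live one level down.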
Recommended inference settings
| Variant | Resolution | Steps | CFG scale | torch_dtype |
|---|---|---|---|---|
| MiniT2I-B-16 | 512×512 | 100 | 2.5 | bfloat16 |
| MiniT2I-L-16 | 512×512 | 100 | 6.0 | bfloat16 |
For GenEval / DPG-Bench evaluation, upstream configs use guidance_scale=5.0 for both B/16 and L/16.
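The two sets of guidance values can be kept apart with a small lookup. This is a sketch with hypothetical names (`RECOMMENDED_CFG`, `EVAL_CFG`, `guidance_scale_for`); the numbers are the ones stated in the settings table and the evaluation note.

```python
RECOMMENDED_CFG = {"MiniT2I-B-16": 2.5, "MiniT2I-L-16": 6.0}
EVAL_CFG = 5.0  # upstream GenEval / DPG-Bench setting for both variants


def guidance_scale_for(variant: str, for_eval: bool = False) -> float:
    """Pick the guidance scale for sampling or for benchmark evaluation."""
    if variant not in RECOMMENDED_CFG:
        raise KeyError(f"unknown variant: {variant!r}")
    return EVAL_CFG if for_eval else RECOMMENDED_CFG[variant]
```

For example, `pipe(prompt, num_inference_steps=100, guidance_scale=guidance_scale_for("MiniT2I-L-16"))` would sample with the recommended L/16 value.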
Interface notes
- Text conditioning uses bundled google/flan-t5-large (T5EncoderModel + T5Tokenizer).
- Scheduler is FlowMatchEulerDiscreteScheduler with 1000 training timesteps and shift=1.0.
- guidance_scale > 1.0 enables classifier-free guidance with an empty-string null prompt.
- Output resolution is fixed at 512×512 for these exports.
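To make the guidance and scheduler notes concrete, here is a minimal pure-Python sketch of the standard formulas: classifier-free guidance combines the prompt-conditioned and null-prompt predictions, the flow-matching time shift remaps sigma, and an Euler step advances the sample. These are the common textbook forms, written on scalars for readability; whether the bundled `pipeline.py` matches them line for line is an assumption, and the function names are illustrative.

```python
def cfg_combine(pred_cond: float, pred_uncond: float, guidance_scale: float) -> float:
    """Classifier-free guidance: extrapolate away from the null-prompt prediction."""
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)


def shift_sigma(sigma: float, shift: float = 1.0) -> float:
    """Flow-matching time shift; shift=1.0 leaves sigma unchanged."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def euler_step(sample: float, velocity: float, sigma: float, sigma_next: float) -> float:
    """One Euler update from noise level sigma to sigma_next."""
    return sample + (sigma_next - sigma) * velocity


if __name__ == "__main__":
    # guidance_scale = 1.0 reduces to the conditional prediction alone.
    print(cfg_combine(0.8, 0.2, 1.0))
    # With shift = 1.0 the sigma schedule is not warped.
    print(shift_sigma(0.5, 1.0))
```

With `guidance_scale=1.0` the combination collapses to the conditional prediction, which is why guidance only takes effect above 1.0; with `shift=1.0` the sigma schedule stays linear.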
Regenerate bundles
From the repository root:
conda activate rsgen
python scripts/convert_minit2i_to_bilisakura.py
Links
- Blog: MiniT2I: A Minimalist Baseline for Text-to-Image Generation
- Upstream checkpoints: MiniT2I/MiniT2I
- PyTorch/Diffusers source: MiniT2I-diffusers
Citation
@misc{minit2i2026,
title = {MiniT2I: A Minimalist Baseline for Text-to-Image Generation},
author = {Wang, Xianbang and Zhao, Hanhong and Lu, Yiyang and Zhou, Kangyang and Ma, Linrui and He, Kaiming},
year = {2026},
url = {https://peppaking8.github.io/#/post/minit2i}
}
