Edit-Plus VAE produces blank/gray output — 11 decoder RMS-norm gamma weights are zero in the published checkpoint

#2
by szwagros - opened

Symptom: Every generation with JoyImageEditPlusPipeline decodes to a uniform gray image (pixel std ≈ 0), regardless of prompt/inputs/steps/dtype.

Root cause: In vae/diffusion_pytorch_model.safetensors, 11 decoder *.gamma tensors are all zeros. Since WanRMS_norm computes normalized * scale * gamma + bias, a zero gamma zeroes the decoder features → conv_out emits only its bias → flat gray. The encoder gammas are intact (~0.85); only the decoder's are zeroed:

decoder.norm_out.gamma
decoder.mid_block.resnets.{0,1}.norm{1,2}.gamma
decoder.up_blocks.0.resnets.{0,1,2}.norm{1,2}.gamma
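
To make the mechanism concrete, here is a minimal NumPy sketch of the formula quoted above (normalized * scale * gamma + bias). It is not the actual WanRMS_norm code: the channel-first layout, scale = sqrt(channels), the tensor shapes, and the 0.25 bias value are all assumptions chosen for illustration.

```python
import numpy as np

def rms_norm(x, gamma, bias, eps=1e-12):
    """Sketch of: normalized * scale * gamma + bias (channel axis = 1)."""
    channels = x.shape[1]
    # L2-normalize each spatial position across the channel axis.
    norm = np.sqrt((x * x).sum(axis=1, keepdims=True))
    normalized = x / np.maximum(norm, eps)
    scale = channels ** 0.5  # assumed scale factor
    return normalized * scale * gamma + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 4, 4))      # (batch, channels, H, W), made-up shape
bias = np.full((1, 8, 1, 1), 0.25)         # made-up bias value

healthy = rms_norm(x, np.full((1, 8, 1, 1), 0.85), bias)  # gamma ~0.85, like the encoder
zeroed = rms_norm(x, np.zeros((1, 8, 1, 1)), bias)        # gamma == 0, like the decoder

print("healthy std:", healthy.std())  # clearly non-zero
print("zeroed std:", zeroed.std())    # every element equals the bias
```

With gamma at zero the input no longer influences the output at all, which is consistent with the flat decode described above.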

Not a download issue: the on-disk file sha256 = df8785965b82543033326c9c27e8c088b10b177da59f608225a049d9f24db86a matches the HF LFS hash, and loads with 0 missing/unexpected keys. The zeros are in the published artifact.
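
For anyone who wants to rule out a bad download the same way, a short sketch of the hash check using only the standard library. The helper names and the local path in the usage comment are hypothetical; the expected digest is the one quoted above.

```python
import hashlib

# Digest quoted in this report for the affected file.
EXPECTED_SHA256 = "df8785965b82543033326c9c27e8c088b10b177da59f608225a049d9f24db86a"

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large checkpoints are not read into RAM at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the on-disk file has the expected SHA-256 digest."""
    return sha256_of_file(path) == expected.lower()

# Usage (path is hypothetical, adjust to your local snapshot):
# print(matches_expected("vae/diffusion_pytorch_model.safetensors"))
```

A match only shows the local copy equals the published one. It says nothing about whether the published weights are themselves correct, which is the point of this report.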

Minimal reproduction — a VAE encode→decode round-trip is enough:

z = vae.encode(x).latent_dist.mode() # healthy latent, std ~1
rec = vae.decode(z).sample # flat, std ~0.017 (should reconstruct x)
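
The two lines above can be wrapped into a pass/fail check. This is only a sketch: it assumes a vae object exposing the same encode/decode calls as in the snippet, and the 0.1 ratio threshold is an arbitrary choice of mine, not something from the library.

```python
def roundtrip_std(vae, x):
    """Return (input std, reconstruction std) for an encode -> decode round trip.

    With a torch model, call this inside torch.no_grad().
    """
    z = vae.encode(x).latent_dist.mode()
    rec = vae.decode(z).sample
    return float(x.std()), float(rec.std())

def looks_collapsed(input_std, recon_std, ratio=0.1):
    """Flag a reconstruction whose std is far below the input's (assumed threshold)."""
    return recon_std < ratio * input_std

# Numbers reported in this thread:
print(looks_collapsed(0.533, 0.017))  # broken VAE -> True
print(looks_collapsed(0.533, 0.529))  # previous-release VAE -> False
```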

Confirmation it's a corrupt export: the VAE from the prior jdopensource/JoyAI-Image-Edit-Diffusers release has these gammas intact, with identical latents_mean/latents_std and an identical decoder key set and shapes (72 of 106 decoder tensors are byte-identical). Since Edit-Plus reportedly differs only by multi-image support, the VAE should be unchanged. Swapping in the whole previous VAE fixes it completely (round-trip recon std 0.529 vs. 0.533 for the input; generations come out correct). Grafting just the 11 gammas does not fully fix it, because other decoder tensors are corrupted too, so please re-export the entire VAE.
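
A rough sketch of how such a comparison between two checkpoints could be done, given two name -> array dicts. The function name and report layout are hypothetical. The loading step in the trailing comment assumes the weights are stored in a dtype NumPy can represent (e.g. float32) and uses placeholder paths.

```python
import numpy as np

def compare_decoders(suspect, reference, prefix="decoder."):
    """Compare decoder tensors of two state dicts (name -> numpy array)."""
    shared = sorted(
        k for k in suspect if k.startswith(prefix) and k in reference
    )
    identical, zeroed = [], []
    for k in shared:
        a, b = suspect[k], reference[k]
        same_layout = a.dtype == b.dtype and a.shape == b.shape
        if same_layout and a.tobytes() == b.tobytes():
            identical.append(k)      # byte-identical in both checkpoints
        elif not np.any(a) and np.any(b):
            zeroed.append(k)         # all zeros in suspect, non-zero in reference
    return {"shared": len(shared), "identical": identical, "zeroed": zeroed}

# Loading (assumption: safetensors is installed and dtypes are NumPy-compatible):
# from safetensors.numpy import load_file
# report = compare_decoders(load_file("new_vae.safetensors"),
#                           load_file("old_vae.safetensors"))
# print(report["shared"], len(report["identical"]), report["zeroed"])
```

Any shared key that lands in neither list differs without being zeroed. Those are the tensors worth inspecting when grafting only the gammas is not enough.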

Request: Re-upload vae/diffusion_pytorch_model.safetensors with the intact decoder weights.

Env: diffusers 0.39.0.dev0 (PR #14032 branch), transformers 5.13.0.dev0, torch 2.8.0+cu128, RTX PRO 6000 (Blackwell).

JD.com Open Source org
edited 2 days ago

@szwagros Thank you for the detailed investigation and for raising this issue! Your analysis is thorough and very helpful.

We are currently verifying this on our end. If the decoder weight corruption is confirmed, we will re-upload the correct VAE weights immediately.

We'll follow up here once the fix is in place. Thanks again for your attention to this!

tangyanfei changed discussion status to closed
tangyanfei changed discussion status to open
JD.com Open Source org

@szwagros Thank you again for your thorough investigation and clear write-up — it was extremely helpful in pinpointing the issue quickly.

We have confirmed the problem on our end: the published VAE decoder indeed had 11 zeroed gamma tensors. We have now re-uploaded the correct vae/diffusion_pytorch_model.safetensors with intact decoder weights. Please re-download and verify on your side.

We also updated inference.py and README.md to reflect a recent API change (_get_bucket_size → vae_image_processor.get_default_height_width).

Apologies for the inconvenience, and thanks again for catching this!

tangyanfei changed discussion status to closed
