Files:
Loras/Mugen_NoobFlux2RF_Test-000003.safetensors
Produces different results, usually worse. Trained for Mugen with an auxiliary training objective from https://arxiv.org/abs/2411.04873.
I decode both the ground-truth latent and the predicted clean latent through only the first two up_blocks of the VAE decoder, compare the two partial decodes with an L2 loss, and add that term back onto the regular flow-matching loss at a weight of 0.1
(so loss_total = flow_matching + 0.1 * latent_perceptual_loss). I only applied this when the timestep was below 50% (sigmas < 0.5).
Loras/RF-Flux2VAE-Consistency-Test-000002.safetensors
Trained by running one forward pass, without gradients, from 20-30 timesteps before the target timestep, then simulating one Euler step to the target timestep and using the resulting latent as the model's training input. So the model is trained on clean latent + noise + discretization error + misprediction error from the previous step.
My thinking was that at inference time the model doesn't receive only the clean latent + Gaussian noise as in training, but also the discretization error and misprediction error from the previous step.
Works on Mugen too, but was trained on NoobAI-Flux2VAE-RectifiedFlow-0.3.
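A sketch of how that training input could be constructed, assuming a rectified-flow parameterization x_t = (1 - t) * x0 + t * noise with velocity target v = noise - x0, and `t`/`jump` expressed on a 0-1 scale (so 20-30 timesteps out of 1000 is a jump of 0.02-0.03). The function name and model signature are illustrative:

```python
import torch


@torch.no_grad()
def consistency_input(model, x0, noise, t, jump):
    """Build the training latent by taking one frozen Euler step from t + jump down to t."""
    # Noisy latent at the earlier (higher-noise) timestep.
    t_prev = t + jump
    x_prev = (1 - t_prev) * x0 + t_prev * noise

    # Frozen forward pass: the gradient-free velocity prediction at t_prev.
    v_pred = model(x_prev, t_prev)

    # One Euler step toward lower noise: x_t = x_prev + (t - t_prev) * v_pred.
    # The result carries the previous step's discretization and misprediction error.
    return x_prev + (t - t_prev) * v_pred
```

The returned latent is then fed to the model with gradients enabled and trained against the usual velocity target `noise - x0`.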
The one below performs better
Loras/RF-Flux2VAE-Consistency-Test-50-000002.safetensors
Exact same settings as above, but trained with timestep jumps of 50-60.
Performs better
Model tree for TheRemixer/misc-small-experiment
Base model
Laxhar/noobai-XL-Vpred-0.75