RLWF: DreamZero checkpoints

Private checkpoint repository for the RLWF paper ("Active Robot Data Collection from World Model Feedback"). It holds two checkpoints, both using the stock DreamZero architecture with no architectural modifications; only the training data and the training config differ.

Layout

rlwf-ckpt/
├── README.md
├── LICENSE
├── mimicgen-core-14b-lora-step80000/   # LoRA fine-tune, ~217 MB
└── mimicgen-core-14b-full-step46000/   # full fine-tune, 10-shard ~47 GB

What each checkpoint is

mimicgen-core-14b-lora-step80000/

  • Architecture: stock DreamZero (groot.vla.model.dreamzero.base_vla.VLA)
  • Base model: Wan2.1-I2V-14B-480P, frozen
  • Adapter: LoRA, rank 4, target modules q,k,v,o,ffn.0,ffn.2
  • Action head: WAN flow-matching action transformer (groot.vla.model.dreamzero.action_head.wan_flow_matching_action_tf.WANPolicyHead)
  • Action dim: 32 (multi-embodiment), horizon 24
  • Training data: MimicGen expert demos on LIBERO MimicGen-core (12 tasks)
  • Step: 80,000
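
For a rough sense of why a rank-4 adapter stays small, the sketch below counts the parameters LoRA adds to a single linear layer. It relies only on the standard LoRA factorization (a `rank x d_in` matrix and a `d_out x rank` matrix per adapted layer). The layer dimensions used here are hypothetical placeholders, not values read from this checkpoint or from Wan2.1.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by LoRA to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Ratio of LoRA parameters to the frozen weight matrix they adapt."""
    return lora_param_count(d_in, d_out, rank) / (d_in * d_out)


if __name__ == "__main__":
    # Hypothetical square projection, for illustration only.
    d = 1024
    print(lora_param_count(d, d, rank=4))
    print(f"{lora_fraction(d, d, rank=4):.4%}")
```

The on-disk size of the LoRA checkpoint also depends on whatever non-adapter weights are saved alongside it, so this count is not expected to reproduce the ~217 MB figure.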

mimicgen-core-14b-full-step46000/

  • Architecture: same stock DreamZero, no changes
  • Variant: full fine-tune (no LoRA) on 16 GPUs with DeepSpeed ZeRO
  • Sharding: 10-shard safetensors (model-{1..10}-of-00010.safetensors)
  • Training data: same MimicGen-core 12 tasks, longer instruction prompts ("detailed_instruct" recipe)
  • Step: 46,000

How to load

With the DreamZero codebase available:

from stable_worldmodel.wm.utils import load_pretrained
# either subdir works the same way:
model = load_pretrained(
    "MinghaoFu/rlwf-ckpt/mimicgen-core-14b-lora-step80000",
    extra_args={"torch_dtype": "bfloat16"},
)
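
If you only need one of the two subdirs on disk (for example before a direct safetensors load), a minimal sketch using `huggingface_hub.snapshot_download` with `allow_patterns` could look like the following. The repo id `MinghaoFu/rlwf-ckpt` is an assumption inferred from the path passed to `load_pretrained` above, and the helper names are hypothetical. Because the repository is private, an access token with read permission is needed.

```python
import os

# Assumption: repo id inferred from the load_pretrained path in this README.
REPO_ID = "MinghaoFu/rlwf-ckpt"


def subdir_patterns(subdir: str) -> list:
    """Build the allow_patterns list that restricts a download to one subdir."""
    return [f"{subdir.strip('/')}/*"]


def download_checkpoint(subdir: str, token=None) -> str:
    """Download a single checkpoint subdir and return its local path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=subdir_patterns(subdir),
        token=token,  # None falls back to the locally stored login, if any
    )
    return os.path.join(root, subdir.strip("/"))
```

Calling `download_checkpoint("mimicgen-core-14b-lora-step80000")` would fetch only the LoRA checkpoint and skip the ~47 GB full fine-tune.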

Direct safetensors load (LoRA, single file):

from safetensors.torch import load_file
state_dict = load_file("model.safetensors")

Direct safetensors load (full, sharded):

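import json
from safetensors.torch import load_file

# Run from inside the checkpoint subdir: shard names in the index are relative to it.
with open("model.safetensors.index.json") as f:
    index = json.load(f)

# weight_map maps each tensor name to its shard file, so load each shard only once.
state_dict = {}
for shard in sorted(set(index["weight_map"].values())):
    state_dict.update(load_file(shard))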
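
Since the full checkpoint is a large multi-shard download, it can help to confirm that every shard named in the index is present before loading. The sketch below uses only the standard library; `missing_shards` is a hypothetical helper, and it assumes shard files sit next to the index file.

```python
import json
from pathlib import Path


def missing_shards(index_path) -> list:
    """Return shard filenames listed in the index that are absent on disk."""
    index_path = Path(index_path)
    with index_path.open() as f:
        index = json.load(f)
    # Deduplicate: many tensors map to the same shard file.
    shard_names = sorted(set(index["weight_map"].values()))
    # Assumption: shards live in the same directory as the index file.
    return [name for name in shard_names if not (index_path.parent / name).is_file()]
```

An empty list means all shards referenced by `model.safetensors.index.json` were found; any names returned point to an incomplete download.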

Full training config is in experiment_cfg/conf.yaml of each subdir.

What is NOT in this repo

  • DeepSpeed optimizer state (global_step*/): stripped to keep the download small. If you want to resume training instead of just loading for inference, ping me; the optimizer shards are kept separately.
  • rng_state_*.pth: same reason.
  • The latest text file: it points to a path inside global_step*/, so it is irrelevant without the optimizer state.
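
To check whether a local copy of a checkpoint directory carries the training state listed above, a small standard-library sketch is shown here. `training_state_report` is a hypothetical helper, and it assumes these files would sit at the top level of the checkpoint subdir.

```python
from pathlib import Path


def training_state_report(ckpt_dir) -> dict:
    """Report which training-resume artifacts are present in a checkpoint dir."""
    d = Path(ckpt_dir)
    return {
        # DeepSpeed optimizer shards live in a global_step*/ directory.
        "optimizer_state": any(p.is_dir() for p in d.glob("global_step*")),
        # Per-rank RNG snapshots.
        "rng_state": any(d.glob("rng_state_*.pth")),
        # Text file pointing at the most recent global_step*/ directory.
        "latest_file": (d / "latest").is_file(),
    }
```

For the subdirs in this repo, all three entries are expected to be `False`, which is fine for inference but not for resuming training.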

License

MIT (see LICENSE). The underlying Wan2.1-I2V-14B-480P base model is covered by its own Apache-2.0 license. The DreamZero architecture follows the original authors' release terms; this repo only redistributes the fine-tuned weights.

Contact

Minghao Fu (isminghaofu@gmail.com)
