NanoWM-B/1 - Web-DINO - DINO-WM / pusht
Encoder-backbone comparison checkpoint for the pusht environment from the DINO-WM suite. This run replaces the SD-VAE latent target with frozen Web-DINO patch features and trains NanoWM-B/1 for 100,000 steps.
This repository contains only the NanoWM transformer weights and training configuration. It does not include Weights & Biases logs or the Web-DINO encoder weights.
Run identity
- collection: https://huggingface.co/collections/knightnemo/nano-world-model
- reference baseline: knightnemo/nanowm-b2-dino-wm-pusht-100k
Training setup
| Key | Value |
|---|---|
| Architecture | NanoWM-B/1 (~160M params) |
| Latent codec | Web-DINO, 224 input, 14px patches, 16x16x1024 features |
| Dataset | DINO-WM pusht |
| Frames | 4 |
| Context frames | 1 |
| Action injection | additive |
| Steps | 100,000 |
| Effective batch | 64 |
| Optimizer | AdamW, lr 1e-4, wd 0.01 |
| Precision | bf16-mixed, torch.compile on |
| Seed | 3407 |
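The codec row fixes the token geometry, which can be checked with a little arithmetic. The sketch below uses only the table values, plus one assumption that the model card does not confirm: that the "/1" in NanoWM-B/1 is the transformer patch size over the latent grid (as in DiT-style naming), so each Web-DINO patch feature becomes one token.

```python
# Values taken from the training-setup table.
INPUT_SIZE = 224       # encoder input resolution in pixels
ENCODER_PATCH = 14     # Web-DINO patch size in pixels
FEATURE_DIM = 1024     # channels per patch feature
FRAMES = 4
CONTEXT_FRAMES = 1

# Assumption (not stated in the model card): "/1" is the transformer
# patch size applied to the latent grid.
MODEL_PATCH = 1


def latent_grid(input_size: int, encoder_patch: int) -> int:
    """Side length of the patch grid produced by the encoder."""
    if input_size % encoder_patch != 0:
        raise ValueError("input size must be a multiple of the patch size")
    return input_size // encoder_patch


grid = latent_grid(INPUT_SIZE, ENCODER_PATCH)
tokens_per_frame = (grid // MODEL_PATCH) ** 2
tokens_per_clip = tokens_per_frame * FRAMES
generated_frames = FRAMES - CONTEXT_FRAMES

print(f"latent per frame: {grid}x{grid}x{FEATURE_DIM}")
print(f"tokens per frame: {tokens_per_frame}, per clip: {tokens_per_clip}")
print(f"frames beyond the context: {generated_frames}")
```

Under that assumption, a 4-frame clip is a sequence of 1,024 tokens, each carrying a 1024-dimensional feature.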
Diffusion setup
| Key | Value |
|---|---|
| pred_name | v |
| noise_schedule | squaredcos_cap_v2 |
| zero_terminal_snr | true |
| timestep_sampling | logit_normal |
| snr_gamma | 5.0 |
| diffusion_steps | 1000 train, 250 DDIM sample |
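To make the table concrete, here is a minimal standard-library sketch of what these settings usually mean. It is not the repository's code. The schedule follows the common capped squared-cosine definition, the zero-terminal-SNR step follows the usual rescaling of the square-rooted cumulative alphas, and the loss weight is the v-prediction form of Min-SNR weighting. The logit-normal mean and standard deviation (0 and 1) and the trailing DDIM spacing are assumptions, since the card does not state them.

```python
import math
import random

TRAIN_STEPS = 1000
DDIM_STEPS = 250
SNR_GAMMA = 5.0


def cosine_alphas_cumprod(num_steps, max_beta=0.999):
    """Cumulative alpha products for the capped squared-cosine schedule."""
    def alpha_bar(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    out, prod = [], 1.0
    for i in range(num_steps):
        beta = 1.0 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps)
        prod *= 1.0 - min(beta, max_beta)
        out.append(prod)
    return out


def rescale_zero_terminal_snr(alphas_cumprod):
    """Shift and scale sqrt(alpha_bar) so the final step has zero SNR."""
    roots = [math.sqrt(a) for a in alphas_cumprod]
    first, last = roots[0], roots[-1]
    scale = first / (first - last)
    return [((r - last) * scale) ** 2 for r in roots]


def v_target(x0, noise, alpha_bar):
    """v-prediction target: sqrt(a) * noise - sqrt(1 - a) * x0."""
    return math.sqrt(alpha_bar) * noise - math.sqrt(1.0 - alpha_bar) * x0


def min_snr_weight_v(alpha_bar, gamma=SNR_GAMMA):
    """Min-SNR loss weight for v-prediction: min(snr, gamma) / (snr + 1)."""
    snr = alpha_bar / (1.0 - alpha_bar)
    return min(snr, gamma) / (snr + 1.0)


def sample_timestep(rng, num_steps=TRAIN_STEPS, mean=0.0, std=1.0):
    """Logit-normal timestep index; mean and std are assumed values."""
    u = 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))
    return min(int(u * num_steps), num_steps - 1)


def ddim_timesteps(num_train=TRAIN_STEPS, num_sample=DDIM_STEPS):
    """Evenly strided sampling steps, starting from the last training step."""
    stride = num_train // num_sample
    return list(range(num_train - 1, -1, -stride))


alphas_cumprod = rescale_zero_terminal_snr(cosine_alphas_cumprod(TRAIN_STEPS))
steps = ddim_timesteps()
t = sample_timestep(random.Random(3407))
print(f"alpha_bar at final step: {alphas_cumprod[-1]}")
print(f"loss weight at t={t}: {min_snr_weight_v(alphas_cumprod[t]):.4f}")
print(f"{len(steps)} DDIM steps with stride {steps[0] - steps[1]}")
```

With a zero terminal SNR the last training step is pure noise, so the v target there reduces to the negated clean latent and the Min-SNR weight drops to zero.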
Loading
```bash
git clone git@github.com:knightnemo/nano-world-model.git
cd nano-world-model
huggingface-cli download knightnemo/nanowm-b1-webdino-dino-wm-pusht-100k --local-dir ./ckpt
```
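```python
import sys

from omegaconf import OmegaConf
from safetensors.torch import load_file

# Make the repository's src/ directory importable (run from the repo root).
sys.path.insert(0, "src")
from models import get_models

# Load the config shipped with the checkpoint and disable torch.compile.
cfg = OmegaConf.load("ckpt/config.yaml")
cfg.experiment.infra.compile = False

# Build the transformer and load the weights; strict=True fails on any key mismatch.
model = get_models(cfg).eval()
state_dict = load_file("ckpt/model.safetensors")
model.load_state_dict(state_dict, strict=True)
```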
The config expects a Web-DINO encoder compatible with facebook/webssl-dino300m-full2b-224 and is set up for encoder-only latent metrics. Because this latent codec has no decoder, pixel-space video sampling and pixel metrics are not available from this checkpoint alone.
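Without a decoder, evaluation has to compare predicted patch features with the features encoded from the ground-truth frame. The card does not list which latent metrics the repository reports, so the two functions below (mean squared error and mean per-token cosine similarity) are hypothetical illustrations of the idea, written in plain Python over nested lists rather than tensors.

```python
import math
from typing import Sequence

Tokens = Sequence[Sequence[float]]  # [num_tokens][feature_dim]


def _check_shapes(pred: Tokens, target: Tokens) -> None:
    if len(pred) != len(target):
        raise ValueError("token counts differ")
    for p_tok, t_tok in zip(pred, target):
        if len(p_tok) != len(t_tok):
            raise ValueError("feature dimensions differ")


def latent_mse(pred: Tokens, target: Tokens) -> float:
    """Mean squared error over every feature of every token."""
    _check_shapes(pred, target)
    total, count = 0.0, 0
    for p_tok, t_tok in zip(pred, target):
        for p, t in zip(p_tok, t_tok):
            total += (p - t) ** 2
            count += 1
    return total / count


def mean_cosine_similarity(pred: Tokens, target: Tokens, eps: float = 1e-8) -> float:
    """Cosine similarity per token, averaged over tokens."""
    _check_shapes(pred, target)
    sims = []
    for p_tok, t_tok in zip(pred, target):
        dot = sum(p * t for p, t in zip(p_tok, t_tok))
        p_norm = math.sqrt(sum(p * p for p in p_tok))
        t_norm = math.sqrt(sum(t * t for t in t_tok))
        sims.append(dot / max(p_norm * t_norm, eps))
    return sum(sims) / len(sims)


# Toy example: two tokens with two features each.
pred = [[1.0, 0.0], [0.0, 2.0]]
target = [[1.0, 0.0], [2.0, 0.0]]
print(latent_mse(pred, target))
print(mean_cosine_similarity(pred, target))
```

In practice the same comparison would run on the 16x16x1024 feature maps, with the target features produced by the frozen Web-DINO encoder.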
Safetensors: model size 0.2B params, tensor type F32
