Frames, chunks, fps
Every spec in the catalog publishes its own wire shape: resolution,
chunk size, frame rate, default rollout length. The SDK reads these
live from model.spec.arch so your code doesn't have to hard-code
anything spec-specific.
This page walks through the values — using the current active spec (DreamDojo · GR-1) as the concrete reference — and the engine's chunk-alignment rule that applies to every spec.
480×640 resolution
The spec config encodes resolution as [H, W] — height first, matching
the engine's internal convention. The catalog JSON, the SDK
(model.resolution), and the engine all agree on (480, 640). Don't
swap to (640, 480); start_frame arrays of the wrong shape get
rejected at the boundary.
48 frames
Internally, the runner generates frames in chunks of 12. The canonical rollout for GR-1 is 4 chunks = 48 frames.
You'll occasionally see "49-frame rollouts" in older marketing copy
(including the catalog's bench_wall_s line). That's wrong: the
regression suite — which is what's actually been benchmarked — runs
48-frame rollouts. Your synthetic example should use 48 too;
dream.examples.dreamdojo_grasp() does.
Why not 49?
49 frames = 4 full chunks (48) + 1 trailing frame. Pre-fix, the
trailing partial chunk crashed the WAN2.1 tokenizer's time_conv
because the cached + new time-dim collapsed to 2 (kernel size is 3).
The engine now pads trailing partial chunks transparently, but the
SDK still validates chunk-alignment client-side and raises
dream.InputValidationError to keep the wire honest and catch
mistakes before paying for the round-trip.
Bottom line: always pass T as a multiple of model.chunk_size.
The SDK won't let you do otherwise.
10 fps
The mp4 output is encoded at 10 fps, which means a 48-frame rollout plays back as a 4.8-second video. That's the cadence the underlying GR00T teleop dataset was recorded at, and the model preserves it.
If you decode rollout.frames (numpy ndarray, shape (48, 480, 640, 3) uint8), there's no fps metadata — those are just frames. The fps
matters only when re-encoding for playback.
Other specs
The catalog endpoint exposes per-spec values:
model = client.models.get("dreamdojo-2b-gr1")print(model.action_dim) # 384print(model.resolution) # (480, 640)print(model.chunk_size) # 12print(model.spec.arch.default_num_steps) # 35When new specs ship (e.g. dreamdojo-2b-yam, dreamdojo-14b-gr1),
they may change resolution, chunk_size, or action_dim. The SDK
hydrates whatever the catalog returns; your code reads the live values
rather than hard-coding assumptions.