model.predict
Run one rollout against a ModelHandle. Returns a Rollout with lazy
frame decode + cost / wall metadata.
Signature
```python
def predict(
    self,
    *,
    start_frame: np.ndarray | PIL.Image.Image | bytes | str | Path | None = None,
    actions: np.ndarray | list | bytes | str | Path | None = None,
    # Phase 0 byte-level surface (kept for power users)
    frame_bytes: bytes | None = None,
    actions_bytes: bytes | None = None,
    frame_path: str | Path | None = None,
    actions_path: str | Path | None = None,
    num_steps: int | None = None,   # override diffusion step count
    guidance: float | None = None,  # override classifier-free guidance
    seed: int = 0,                  # deterministic seed
) -> Rollout
```

You either pass start_frame + actions (recommended) or the
byte-level args. Mixing the two raises dream.InputValidationError.
Inputs — start_frame
| Type | What it does |
|---|---|
| np.ndarray (H, W, 3) uint8 | encoded to PNG, sent as multipart |
| np.ndarray (3, H, W) uint8 | auto-transposed to (H, W, 3) |
| np.ndarray float in [0, 1] | scaled to uint8, then encoded |
| PIL.Image.Image | re-encoded to PNG |
| bytes | passed through (assumes PNG/JPEG) |
| str / Path | read from disk as bytes |
Validation runs at the SDK boundary — wrong shape, wrong dtype, or
missing path raises dream.InputValidationError before the request
hits the network.
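The conversion rules in the table above can be sketched as a pure-numpy helper. This is illustrative only (the real SDK also handles PIL images, raw bytes, and paths); the function name and error messages are assumptions:

```python
import numpy as np

def normalize_start_frame(arr: np.ndarray) -> np.ndarray:
    """Sketch of the SDK's array handling: CHW -> HWC, float [0, 1] -> uint8."""
    if arr.ndim == 3 and arr.shape[0] == 3 and arr.shape[2] != 3:
        arr = arr.transpose(1, 2, 0)              # (3, H, W) -> (H, W, 3)
    if np.issubdtype(arr.dtype, np.floating):
        if arr.min() < 0 or arr.max() > 1:
            raise ValueError("float frames must be in [0, 1]")
        arr = (arr * 255).round().astype(np.uint8)  # scale, then encode as uint8
    if arr.ndim != 3 or arr.shape[-1] != 3 or arr.dtype != np.uint8:
        raise ValueError(f"expected (H, W, 3) uint8, got {arr.shape} {arr.dtype}")
    return arr

chw = np.random.rand(3, 480, 640).astype(np.float32)  # channels-first float input
hwc = normalize_start_frame(chw)
print(hwc.shape, hwc.dtype)  # (480, 640, 3) uint8
```

Running the check client-side like this is what lets bad inputs fail before any network round-trip.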
Inputs — actions
Required shape: (T, action_dim) float32, where action_dim matches
the model spec (model.action_dim, e.g. 384 for GR-1) and T is a
multiple of model.chunk_size (12 for GR-1; the canonical rollout
length for GR-1-class specs is 48 frames = 4 chunks — see
Frames, chunks, fps).
| Type | What it does |
|---|---|
| np.ndarray (T, action_dim) | cast to float32, saved as .npy |
| np.ndarray (T, action_dim) float64 | silently downcast to float32 |
| nested list / tuple | converted via np.asarray |
| bytes | passed through (assumes .npy blob) |
| str / Path | read from disk |
action_dim mismatch raises dream.InputValidationError with a
specific message ("model expects 384, got array with shape (49, 100)").
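The shape rules above can be mirrored client-side. A minimal sketch, assuming GR-1's published dims (action_dim=384, chunk_size=12); the helper name and exact messages are illustrative, not the SDK's:

```python
import numpy as np

ACTION_DIM = 384   # model.action_dim for GR-1
CHUNK_SIZE = 12    # model.chunk_size for GR-1

def validate_actions(actions, *, action_dim=ACTION_DIM, chunk_size=CHUNK_SIZE):
    """Sketch of the SDK's actions check: coerce to float32, then verify shape."""
    arr = np.asarray(actions, dtype=np.float32)  # nested lists and float64 both land here
    if arr.ndim != 2 or arr.shape[1] != action_dim:
        raise ValueError(f"model expects {action_dim}, got array with shape {arr.shape}")
    if arr.shape[0] % chunk_size != 0:
        raise ValueError(f"T={arr.shape[0]} is not a multiple of chunk_size={chunk_size}")
    return arr

ok = validate_actions(np.zeros((48, 384), dtype=np.float64))
print(ok.dtype)  # float32: silently downcast
```

Note the dimension check fires before the chunk-multiple check, matching the error message quoted above for a (49, 100) array.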
The result — Rollout
```python
@dataclass
class Rollout:
    mp4_bytes: bytes        # raw mp4 — always populated
    request_id: str         # server-assigned UUID
    engine_wall_ms: float   # server-side Engine.predict() wall
    cost_usd: float         # frames × tier_price
    customer_id: str        # Stripe customer
    psnr_db: float | None = None  # populated when score=True (future)
    ssim: float | None = None
    lpips: float | None = None
```

Convenience accessors:

```python
rollout.cost_usd         # alias for cost_usd
rollout.wall_s           # engine_wall_ms / 1000
rollout.frames           # (T, H, W, 3) uint8 numpy ndarray (decoded lazily)
rollout.save("out.mp4")  # writes mp4_bytes to disk; returns Path
```

rollout.frames triggers decode-via-mediapy on first access and caches
the array. If the [decode] extra isn't installed, frames returns
None; mp4_bytes is always there.
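The decode-once-then-cache behavior can be approximated with a memoizing property. A minimal sketch with a pluggable decoder standing in for the mediapy-backed decode (class and parameter names here are hypothetical, not the SDK's):

```python
class LazyRollout:
    """Sketch of Rollout's lazy frames: decode on first access, cache, tolerate no decoder."""

    def __init__(self, mp4_bytes: bytes, decoder=None):
        self.mp4_bytes = mp4_bytes
        self._decoder = decoder   # stands in for the mediapy-backed decode path
        self._frames = None

    @property
    def frames(self):
        if self._frames is None and self._decoder is not None:
            self._frames = self._decoder(self.mp4_bytes)  # first access pays the decode
        return self._frames       # stays None when no decoder is available

calls = []
def fake_decoder(b):
    calls.append(b)
    return "decoded"

r = LazyRollout(b"\x00", decoder=fake_decoder)
r.frames
r.frames
print(len(calls))  # 1: the second access hits the cache
```

The same pattern explains why mp4_bytes is always safe to use while frames may be None.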
Examples
Numpy in, mp4 out
```python
import numpy as np
import dream

client = dream.Client()
model = client.models.get("dreamdojo-2b-gr1")

start_frame = np.zeros((480, 640, 3), dtype=np.uint8)  # your real start
actions = np.load("teleop.npy")  # (48, 384)

rollout = model.predict(start_frame=start_frame, actions=actions)
rollout.save("out.mp4")
```

Override diffusion steps

```python
fast = model.predict(
    start_frame=img,
    actions=actions,
    num_steps=20,  # default 35 for DreamDojo; halving roughly halves wall at a small quality cost
)
```

From file paths

```python
rollout = model.predict(
    start_frame="/tmp/start.png",
    actions="/tmp/teleop.npy",
)
```

Cost vs wall
The two clocks are independent:
- engine_wall_s ≈ 2.6 s for DreamDojo on H100 (warm).
- Client-side wall = engine_wall_s + transit (~1 s) + cold-start (0 if warm, ~75 s if cold).
You're billed on the server's frame count, not on either wall.
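As back-of-envelope arithmetic using the illustrative numbers above (a hypothetical helper, not part of the SDK):

```python
def expected_client_wall_s(engine_wall_s=2.6, transit_s=1.0, cold=False):
    """Estimate client-observed wall from the numbers above (DreamDojo on H100)."""
    cold_start_s = 75.0 if cold else 0.0  # cold-start dominates when it happens
    return engine_wall_s + transit_s + cold_start_s

print(expected_client_wall_s())           # ~3.6 s warm
print(expected_client_wall_s(cold=True))  # ~78.6 s cold
```

The billed amount is unaffected by either estimate: it depends only on the server's frame count.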
Errors
- dream.InputValidationError — wrong shape, dtype, missing path, conflicting args.
- dream.ModelNotActiveError — handle slug isn't the server's active spec.
- dream.AuthError, dream.RateLimitError, dream.ModelNotFoundError, dream.ServerError — network-side. See Errors & retries.
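The network-side errors are the retryable ones. A minimal backoff wrapper, sketched with a stand-in exception class since dream isn't imported here (see Errors & retries for the SDK's own guidance):

```python
import time

class RateLimitError(Exception):
    """Stand-in for dream.RateLimitError."""

def with_retries(fn, *, attempts=3, base_delay_s=0.01):
    """Retry fn on rate limits with exponential backoff; re-raise everything else."""
    for i in range(attempts):
        try:
            return fn()
        except RateLimitError:
            if i == attempts - 1:
                raise               # out of attempts: surface the error
            time.sleep(base_delay_s * 2 ** i)

tries = []
def flaky():
    tries.append(1)
    if len(tries) < 3:
        raise RateLimitError      # first two calls are rate-limited
    return "ok"

result = with_retries(flaky)
print(result)  # ok, after two rate-limited attempts
```

InputValidationError, by contrast, should never be retried: the same bad input will fail the same way every time.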