Updated 2026-05-06: now uses
client.predict_many (shipping in dream-engine 0.3.0).
Data augmentation
Bootstrap a policy-training dataset by rolling out a world model from
real teleop start frames + perturbed action sequences. Bit-stable on
Hopper means the augmented set is reproducible — the same (seed, start_frame, actions) always emits the same mp4.
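Bit-stable means byte-identical output, which you can check mechanically: hash the emitted bytes and compare digests across runs. A minimal sketch of that check — the byte strings here are placeholders standing in for mp4 files; in practice you would read the files the run wrote to disk:

```python
import hashlib

def rollout_digest(video_bytes: bytes) -> str:
    # SHA-256 over the raw mp4 bytes; bit-stable rollouts yield identical digests
    return hashlib.sha256(video_bytes).hexdigest()

# Placeholders for two runs of the same (seed, start_frame, actions) triple
run_a = b"mp4-bytes-from-run-1"
run_b = b"mp4-bytes-from-run-1"
assert rollout_digest(run_a) == rollout_digest(run_b)
```

Storing the digest next to each video makes later reproducibility checks a one-line comparison.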
When this is useful
- Your real-teleop dataset is small (<100 episodes) and you need 10× more for downstream policy training.
- You have hard-to-reach states in the original demos (rare grasps, failure recoveries) and want to synthesize variations.
- You want to evaluate a candidate policy against reproducible trajectories before committing to a real-robot trial.
The pattern
client.predict_many handles the iteration loop, concurrency, and
retry for you. Point it at a source of episodes and a sink for the
output videos:
```python
import numpy as np
import dream
from io import BytesIO
from pathlib import Path

from PIL import Image

from dream.io import IterableSource, RolloutSink, SourceRow


class PerturbedEpisodeSource(IterableSource):
    """
    Loads real teleop episodes from disk and emits K perturbed
    variants of each as separate rows.
    """

    def __init__(self, episode_paths: list[Path], k_perturbs: int = 4):
        self._paths = episode_paths
        self._k = k_perturbs
        self.rows_hint = len(episode_paths) * k_perturbs

    def __iter__(self):
        for ep_path in self._paths:
            ep = np.load(ep_path)
            start = ep["start_frame"]  # (480, 640, 3) uint8
            base = ep["actions"]       # (48, 384) float32
            for i in range(self._k):
                noise = (np.random.RandomState(i)
                         .randn(*base.shape)
                         .astype(np.float32) * 0.05)
                actions = (base + noise).astype(np.float32)

                # Encode frame to PNG bytes.
                buf = BytesIO()
                Image.fromarray(start).save(buf, format="PNG")

                # Encode actions to .npy bytes.
                act_buf = BytesIO()
                np.save(act_buf, actions)

                yield SourceRow(
                    row_id=f"{ep_path.stem}_aug{i:02d}",
                    frame_bytes=buf.getvalue(),
                    actions_bytes=act_buf.getvalue(),
                    metadata={"seed": i, "source_episode": ep_path.stem},
                )


REAL_EPISODES = list(Path("/data/gr1_teleop").glob("ep_*.npz"))

client = dream.Client()
src = PerturbedEpisodeSource(REAL_EPISODES, k_perturbs=4)
sink = RolloutSink.dir("/data/augmented")

# Optional: check cost before running.
estimate = client.estimate_cost(src, spec="dreamdojo-2b-gr1")
print(f"≈ ${estimate.total_usd:.2f} for {estimate.rows} rows")

result = client.predict_many(
    src,
    sink,
    spec="dreamdojo-2b-gr1",
    concurrency=8,  # 8 concurrent requests; Modal autoscales
    on_error="skip",
    progress=True,
)
print(f"{result.ok} ok / {result.failed} failed → {result.output_uri}")
# /data/augmented/videos/ep_001_aug00.mp4 … ep_100_aug03.mp4
# /data/augmented/metadata.parquet
```

For 100 real episodes × 4 perturbations at $0.0245 per rollout (the GR-1 per-rollout charge: 49 frames × $0.0005):

100 × 4 × $0.0245 = $9.80

See the bulk inference quickstart
for a simpler walk-through using the public kingJulio/dream-engine-example-frames
fixture.
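The arithmetic generalizes to other dataset sizes. A small budgeting helper, using the per-frame price and frame count quoted above as constants (these are the figures from this page, not values read from the API):

```python
FRAME_PRICE_USD = 0.0005  # per-frame price quoted above for dreamdojo-2b-gr1
FRAMES_PER_ROLLOUT = 49   # frames emitted per rollout

def augmentation_cost_usd(n_episodes: int, k_perturbs: int) -> float:
    # Total rollouts × per-rollout charge.
    per_rollout = FRAMES_PER_ROLLOUT * FRAME_PRICE_USD
    return n_episodes * k_perturbs * per_rollout

print(f"${augmentation_cost_usd(100, 4):.2f}")  # the $9.80 figure above
```

Run it against `estimate_cost` before a large job; if the two disagree, your source is emitting a different row count than you think.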
Avoid predict_batch here
predict_batch shares one start frame across K rollouts. For
augmentation you typically want K different start frames (one per
real episode), each with a few perturbed action sequences. The
predict_many pattern above handles this correctly — each
PerturbedEpisodeSource row is an independent request.
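The per-row perturbation itself is plain numpy and worth sanity-checking in isolation. A sketch with shapes mirroring the GR-1 episode layout above (the helper name `perturb` is illustrative, not part of the library):

```python
import numpy as np

def perturb(base: np.ndarray, i: int, scale: float = 0.05) -> np.ndarray:
    # Variant i gets its own fixed RNG stream, so regeneration is deterministic
    noise = np.random.RandomState(i).randn(*base.shape).astype(np.float32) * scale
    return (base + noise).astype(np.float32)

base = np.zeros((48, 384), dtype=np.float32)  # one episode's action sequence
variants = [perturb(base, i) for i in range(4)]

assert variants[0].shape == (48, 384)
# Same (base, seed) regenerates byte-identical actions
assert np.array_equal(variants[2], perturb(base, 2))
```

Note that seeding on the variant index alone means episode N and episode M share noise for the same index; seed on `(episode, i)` instead if that correlation matters for your training setup.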
If you do want K perturbations of the same starting frame (e.g. for
ablation experiments or contrastive losses), predict_batch is
cheaper.
Quality control
The augmented frames look reasonable in-distribution but degrade as your perturbation magnitude grows. Two checks:
- Visual sanity — eyeball every 50th rollout. If the GR-1 suddenly teleports or the mug deforms, you've gone OOD.
- PSNR drift — score augmented rollouts against their unperturbed parent. The engine ships at PSNR 22.35 dB on canonical inputs; augmented rollouts at perturb=0.05 typically come in within 1 dB of that figure. Past a 2-3 dB drop, the synthetic data is noticeably hallucinatory and shouldn't be used for training.
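The PSNR computation needs nothing beyond numpy. A sketch scoring a simulated drifted frame against its parent, assuming the standard 8-bit peak of 255:

```python
import numpy as np

def psnr_db(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    # Peak signal-to-noise ratio in dB; higher means closer to the parent
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0.0 else float(10.0 * np.log10(peak ** 2 / mse))

rng = np.random.default_rng(0)
parent = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
# Simulate a mildly drifted augmented frame (±5 intensity levels)
drifted = np.clip(parent.astype(np.int16) + rng.integers(-5, 6, parent.shape),
                  0, 255).astype(np.uint8)

assert psnr_db(parent, parent) == float("inf")
assert psnr_db(parent, drifted) > 30.0  # small perturbation, small drop
```

In practice you would decode matching frames from the parent and augmented mp4s and average the per-frame scores before applying the 2-3 dB cutoff.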
Reproducibility receipt
Save the inputs alongside the outputs:
```python
np.savez(
    f"/data/augmented/{stem}_aug{i:02d}.npz",
    seed=i,
    start_frame=start,
    actions=actions,
    request_id=rollout.request_id,
    engine_wall_ms=rollout.engine_wall_ms,
    cost_usd=rollout.cost_usd,
)
```

Six months later, you can re-run any single augmented episode with the
same seed and confirm the engine still produces the same bytes. That
matters for reproducibility audits in research papers.
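The verification step then reduces to regenerating the perturbed actions from the stored seed and comparing byte-for-byte. A self-contained sketch of that round trip, using an in-memory buffer in place of the .npz on disk:

```python
import numpy as np
from io import BytesIO

base = np.zeros((48, 384), dtype=np.float32)  # stands in for a real episode's actions
seed = 2
actions = base + np.random.RandomState(seed).randn(*base.shape).astype(np.float32) * 0.05

# Write the receipt (to disk in practice; a buffer here)
buf = BytesIO()
np.savez(buf, seed=seed, actions=actions)
buf.seek(0)

# Later: reload, regenerate from the stored seed, and compare exactly
receipt = np.load(buf)
regenerated = (base +
               np.random.RandomState(int(receipt["seed"]))
               .randn(*base.shape).astype(np.float32) * 0.05)
assert np.array_equal(receipt["actions"], regenerated)
```

Pair this with a fresh engine rollout and the digest comparison to close the loop from stored inputs to identical output bytes.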