kinder-mbrl: Learned World Models for KinDER

Trained world-model checkpoints for the KinDER physical-reasoning benchmark (RSS 2026).

These checkpoints are produced by the kinder-mbrl baseline repository and are intended to be used with its random-shooting MPC planner.


Model architecture

Each checkpoint (.pt) contains a two-head MLPDynamics network together with all normalizer statistics needed for inference.

Network topology

input: [s_t (state_dim) | a_t (action_dim)]
           │
  ┌────────▼─────────┐
  │  Linear → SiLU   │  hidden_dim = 256
  │  Linear → SiLU   │  hidden_dim = 256
  └────────┬─────────┘
           │ shared trunk
     ┌─────┴──────┐
     ▼            ▼
robot_head      env_head
(robot_dim)    (env_dim)
delta_robot    delta_env

Prediction rule:

delta_robot, delta_env = model(normalize(s_t), normalize(a_t))
s_{t+1} = s_t + concat(denormalize(delta_robot), denormalize(delta_env))

The robot-state and environment-state deltas are predicted by two separate linear heads on top of a shared trunk. Each head's targets are standardized per feature (zero mean, unit variance) with statistics fitted on the training dataset, so the two dynamics regimes (actuator / kinematic changes vs. object / scene changes) each get their own output head and their own target scaling while still sharing features through the trunk.
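
The topology and prediction rule above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's MLPDynamics: the class name TwoHeadMLP, the attribute names (trunk, robot_head, env_head), and the helper predict_next_state are assumptions, so the state_dict keys will not necessarily match those stored in the checkpoints. The sketch also assumes the robot features come first in the state vector, as the concat order in the prediction rule suggests.

```python
import torch
import torch.nn as nn


class TwoHeadMLP(nn.Module):
    """Hypothetical two-head dynamics net: shared trunk, separate delta heads."""

    def __init__(self, state_dim, action_dim, robot_dim, env_dim, hidden_dim=256):
        super().__init__()
        # Shared trunk: two Linear -> SiLU layers over the concatenated [state | action].
        self.trunk = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
        )
        self.robot_head = nn.Linear(hidden_dim, robot_dim)  # normalized robot delta
        self.env_head = nn.Linear(hidden_dim, env_dim)      # normalized env delta

    def forward(self, s_normed, a_normed):
        h = self.trunk(torch.cat([s_normed, a_normed], dim=-1))
        return self.robot_head(h), self.env_head(h)


def predict_next_state(model, s, a, norms):
    """Apply the prediction rule; norms[key] is a dict with 'mean' and 'std' tensors."""
    s_n = (s - norms["s_norm"]["mean"]) / norms["s_norm"]["std"]
    a_n = (a - norms["a_norm"]["mean"]) / norms["a_norm"]["std"]
    with torch.no_grad():
        d_robot_n, d_env_n = model(s_n, a_n)
    # Undo the per-head target normalization before adding the residual.
    d_robot = d_robot_n * norms["dr_norm"]["std"] + norms["dr_norm"]["mean"]
    d_env = d_env_n * norms["de_norm"]["std"] + norms["de_norm"]["mean"]
    return s + torch.cat([d_robot, d_env], dim=-1)
```

Predicting a delta and adding it back to s_t means an untrained or near-zero output corresponds to "nothing moves", which is a reasonable prior for one-step physical dynamics.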

Normalizers

Four independent normalizers are stored in each checkpoint:

Key       Applied to
s_norm    full state vector
a_norm    action vector
dr_norm   robot-state delta
de_norm   environment-state delta
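
A per-feature zero-mean, unit-variance normalizer of the kind listed above can be sketched in a few lines of NumPy. The function names and the eps term are assumptions made for illustration; the repository's own normalizer may handle constant features differently.

```python
import numpy as np


def fit_normalizer(x, eps=1e-6):
    """Per-feature statistics over the dataset axis; eps (an assumption) guards constant features."""
    return {"mean": x.mean(axis=0), "std": x.std(axis=0) + eps}


def normalize(x, norm):
    return (x - norm["mean"]) / norm["std"]


def denormalize(z, norm):
    return z * norm["std"] + norm["mean"]


rng = np.random.default_rng(0)
states = rng.normal(loc=3.0, scale=2.0, size=(1000, 4))  # toy stand-in for dataset states
s_norm = fit_normalizer(states)
z = normalize(states, s_norm)        # per-feature mean ~0 and std ~1
recovered = denormalize(z, s_norm)   # round-trips back to the raw states
```

The same fit would be applied separately to actions, robot-state deltas, and environment-state deltas to obtain the other three normalizers.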

Checkpoint format

Each .pt file is a torch.save dict with the following keys:

Key          Type                  Description
model_state  OrderedDict           PyTorch state_dict for MLPDynamics
state_dim    int                   Full state dimension (robot_dim + env_dim)
action_dim   int                   Action dimension
robot_dim    int                   Robot-state sub-dimension
env_dim      int                   Environment-state sub-dimension
s_norm       dict with mean, std   State normalizer statistics
a_norm       dict with mean, std   Action normalizer statistics
dr_norm      dict with mean, std   Robot-delta normalizer statistics
de_norm      dict with mean, std   Env-delta normalizer statistics
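
Before wiring a checkpoint into a planner, it can help to sanity-check the dict against the layout above. The helper below is a hypothetical sketch (check_checkpoint is not part of kinder-mbrl) and assumes each mean and std is a 1-D sequence whose length equals the corresponding dimension. It runs on a toy dict here; in practice the dict would come from torch.load on a .pt file.

```python
def check_checkpoint(ckpt):
    """Return a list of problems found in a checkpoint dict (empty if none)."""
    required = ["model_state", "state_dim", "action_dim", "robot_dim", "env_dim",
                "s_norm", "a_norm", "dr_norm", "de_norm"]
    problems = [f"missing key: {key}" for key in required if key not in ckpt]
    if problems:
        return problems
    if ckpt["state_dim"] != ckpt["robot_dim"] + ckpt["env_dim"]:
        problems.append("state_dim != robot_dim + env_dim")
    # Each normalizer should carry one mean and one std per feature it is applied to.
    expected = {"s_norm": "state_dim", "a_norm": "action_dim",
                "dr_norm": "robot_dim", "de_norm": "env_dim"}
    for norm_key, dim_key in expected.items():
        for stat in ("mean", "std"):
            if stat not in ckpt[norm_key]:
                problems.append(f"{norm_key} is missing {stat}")
            elif len(ckpt[norm_key][stat]) != ckpt[dim_key]:
                problems.append(f"{norm_key}.{stat} has wrong length")
    return problems


# Toy dict with the documented keys; a real one would be loaded from a .pt file.
toy = {
    "model_state": {}, "state_dim": 5, "action_dim": 2, "robot_dim": 2, "env_dim": 3,
    "s_norm": {"mean": [0.0] * 5, "std": [1.0] * 5},
    "a_norm": {"mean": [0.0] * 2, "std": [1.0] * 2},
    "dr_norm": {"mean": [0.0] * 2, "std": [1.0] * 2},
    "de_norm": {"mean": [0.0] * 3, "std": [1.0] * 3},
}
problems = check_checkpoint(toy)
```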

Training

Models were trained from HDF5 demonstration datasets available at kinder-bench/kinder-datasets.

# Install the training package
git clone https://github.com/Princeton-Robot-Planning-and-Learning/kinder-baselines
cd kinder-baselines/kinder-mbrl
uv pip install -e ".[develop]"

# Train (1 000 epochs, batch 512, lr 1e-3 by default)
python experiments/train_world_model.py \
    --mode train \
    --hdf5_path /path/to/dataset.hdf5 \
    --output_dir output \
    --epochs 1000

# Evaluate open-loop rollout error
python experiments/train_world_model.py \
    --mode eval \
    --hdf5_path /path/to/dataset.hdf5 \
    --checkpoint output/wm.pt
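
The exact metric reported by the eval mode is not spelled out here. As a rough sketch of what an open-loop rollout error usually measures, the snippet below rolls a one-step predictor forward from the first recorded state, feeding it its own predictions and the recorded actions, and compares each prediction against the recorded trajectory. The function name, the L2 error, and the toy dynamics are all assumptions.

```python
import numpy as np


def open_loop_rollout_error(step_fn, states, actions):
    """Per-step L2 error of a rollout started at states[0] and driven by recorded actions.

    step_fn(state, action) -> next_state is any one-step predictor.
    states has shape (T + 1, state_dim); actions has shape (T, action_dim).
    """
    pred = states[0]
    errors = []
    for t, action in enumerate(actions):
        pred = step_fn(pred, action)  # feed the prediction back in, never the true state
        errors.append(np.linalg.norm(pred - states[t + 1]))
    return np.array(errors)


# Toy check: true dynamics s' = s + a, and a model with a small constant bias.
def true_step(s, a):
    return s + a


def biased_step(s, a):
    return s + a + 0.01


actions = np.ones((5, 2))
states = np.vstack([np.zeros((1, 2)), np.cumsum(actions, axis=0)])
errors = open_loop_rollout_error(biased_step, states, actions)
```

In this toy case the error grows with every step, which is the compounding behavior that makes open-loop error a stricter test than one-step prediction error.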

Default hyperparameters:

Hyperparameter   Value
Hidden dim       256
Activation       SiLU
Optimizer        Adam
Learning rate    1e-3
Batch size       512
Epochs           1000
Loss             MSE (robot head) + MSE (env head)
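
The loss in the last row can be written out as below. This sketch assumes both terms are computed on normalized deltas and summed with equal weight; the function name is hypothetical and the training script may differ in detail.

```python
import numpy as np


def two_head_loss(pred_dr, pred_de, target_dr, target_de):
    """Sum of the per-head mean squared errors (targets assumed already normalized)."""
    mse_robot = np.mean((pred_dr - target_dr) ** 2)
    mse_env = np.mean((pred_de - target_de) ** 2)
    return mse_robot + mse_env


# Toy batch: 8 samples, robot_dim = 3, env_dim = 12.
loss = two_head_loss(np.zeros((8, 3)), np.zeros((8, 12)),
                     np.ones((8, 3)), 2.0 * np.ones((8, 12)))
```

Because each term is averaged over its own head, a head with few output dimensions contributes on the same footing as a much wider one.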

Inference / MPC usage

import torch
from kinder_mbrl.planning import load_world_model, wm_get_next_state

# Load checkpoint
model, norms = load_world_model("path/to/wm.pt")

# One-step prediction (current_state has length state_dim, action has length action_dim)
next_state = wm_get_next_state(current_state, action, model, norms)

To run the full random-shooting MPC loop:

# World-model-based planning
python experiments/run_mpc.py \
    --use_world_model \
    --checkpoint output/wm.pt \
    --num_candidates 50 \
    --horizon 5
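
For readers unfamiliar with random-shooting MPC, the loop can be sketched as follows: sample num_candidates action sequences of length horizon, roll each one through the world model, and execute the first action of the cheapest sequence. Uniform sampling within action bounds, the summed per-step cost, and the names random_shooting, step_fn, and cost_fn are assumptions for illustration; run_mpc.py may sample and score candidates differently.

```python
import numpy as np


def random_shooting(step_fn, cost_fn, state, action_low, action_high,
                    num_candidates=50, horizon=5, rng=None):
    """Return the first action of the lowest-cost sampled action sequence."""
    rng = np.random.default_rng() if rng is None else rng
    action_low = np.asarray(action_low, dtype=float)
    action_high = np.asarray(action_high, dtype=float)
    # Sample candidate sequences uniformly within the action bounds.
    candidates = rng.uniform(action_low, action_high,
                             size=(num_candidates, horizon, action_low.shape[0]))
    best_cost, best_action = np.inf, None
    for seq in candidates:
        s, total = state, 0.0
        for a in seq:
            s = step_fn(s, a)    # one-step world-model prediction
            total += cost_fn(s)  # accumulate cost along the imagined rollout
        if total < best_cost:
            best_cost, best_action = total, seq[0]
    return best_action


# Toy usage: additive dynamics and distance-to-goal cost stand in for the learned model.
goal = np.array([1.0, 1.0])
action = random_shooting(lambda s, a: s + a,
                         lambda s: float(np.linalg.norm(s - goal)),
                         np.zeros(2), [-0.5, -0.5], [0.5, 0.5],
                         rng=np.random.default_rng(0))
```

Only the first action is executed; the planner is then re-run from the next observed state, which limits how far model errors can compound.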

Citation

If you use these checkpoints, please cite the paper "KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning":

@inproceedings{huang2026kinder,
  title     = {KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning},
  author    = {Huang, Yixuan and Li, Bowen and Saxena, Vaibhav and Liang, Yichao and Mishra, Utkarsh and Ji, Liang and Zha, Lihan and Wu, Jimmy and Kumar, Nishanth and Scherer, Sebastian and Xu, Danfei and Silver, Tom},
  booktitle = {Robotics: Science and Systems (RSS)},
  year      = {2026}
}