WBench Weights

Pre-trained model weights for WBench evaluation.

This repository contains the consolidated model weights for WBench, a comprehensive multi-turn benchmark for interactive video world model evaluation. WBench evaluates world models along five dimensions: video quality, setting adherence, interaction adherence, consistency, and physics compliance. It contains 289 test cases and 1,058 interaction turns covering diverse scenes, styles, subjects, and perspectives.

Usage

Please refer to the WBench GitHub repository for installation and evaluation instructions. You can download the weights using the Hugging Face CLI:

huggingface-cli download meituan-longcat/WBench-weights --local-dir weights/
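The same download can also be scripted from Python. The sketch below is one possible approach, not part of the WBench tooling: it assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The helper names `download_wbench_weights` and `summarize_weights` are hypothetical, introduced here only for illustration.

```python
from pathlib import Path


def download_wbench_weights(local_dir: str = "weights") -> str:
    """Download the whole weights repository and return the local path."""
    # Imported lazily so summarize_weights() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="meituan-longcat/WBench-weights",
        local_dir=local_dir,
    )


def summarize_weights(local_dir: str = "weights") -> dict[str, int]:
    """Map each downloaded file (path relative to local_dir) to its size in bytes.

    Files under hidden directories (for example a download-metadata cache)
    are skipped; that filter is an assumption about what you want listed.
    """
    root = Path(local_dir)
    sizes = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and not any(part.startswith(".") for part in rel.parts):
            sizes[rel.as_posix()] = path.stat().st_size
    return sizes
```

Calling `download_wbench_weights()` and then `summarize_weights("weights")` gives a quick listing to compare against the checkpoints the evaluation scripts expect. The expected file layout itself is defined by the WBench GitHub repository, not by this sketch.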

Disclaimer

We consolidate these weights into a single repository to help the community quickly deploy the WBench evaluation framework without hunting for individual checkpoints. These weights are redistributed solely for academic research and evaluation purposes. All rights belong to the original authors. See LICENSE_NOTICE.md for per-model licenses. If you believe any content infringes your rights, please contact us and we will remove it promptly:

Kaining Ying: kaining.ying.cv@gmail.com
Siyu Ren: rensiyu07@meituan.com

Citation

@article{ying2025wbench,
  title={WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation},
  author={Ying, Kaining and Hu, Hengrui and Ren, Siyu and Li, Jiamu and Chen, Fengjiao and Wang, Ziwen and Cao, Xuezhi and Cai, Xunliang and Ding, Henghui},
  journal={arXiv preprint arXiv:2605.25874},
  year={2025}
}
