justachetan/flat-pack-bench
Flat-Pack Bench: Evaluating Spatio-Temporal Understanding in Large Vision-Language Models through Furniture Assembly (CVPR 2026)
Note: Manually curated raw annotations from which the benchmark's evaluation data was constructed.
Note: Processed data used for the actual benchmarking, along with model responses from the experiments in the paper.
Note: Miscellaneous files used in or produced by the experiments, such as question annotations with scrambled furniture part IDs, densified segmentation maps used in the agent experiments, and traces produced by the agent during execution.