- UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer
  Paper • 2606.16255 • Published • 14
- Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation
  Paper • 2606.17030 • Published • 29
- ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?
  Paper • 2606.19531 • Published • 17
Muhammadinam
INAM2004
AI & ML interests
None yet
Recent Activity
updated a collection 3 days ago
Research for mutimodle modle
updated a collection 3 days ago
Research for mutimodle modle
Organizations
None yet