Augusteinia's Collections
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Paper
• 2505.09568
• Published
• 99
Paper
• 2505.09388
• Published
• 337
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Paper
• 2505.11049
• Published
• 61
Emerging Properties in Unified Multimodal Pretraining
Paper
• 2505.14683
• Published
• 133
MMaDA: Multimodal Large Diffusion Language Models
Paper
• 2505.15809
• Published
• 98
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper
• 2505.18129
• Published
• 62
Video World Models with Long-term Spatial Memory
Paper
• 2506.05284
• Published
• 55
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Paper
• 2506.07491
• Published
• 50
Sekai: A Video Dataset towards World Exploration
Paper
• 2506.15675
• Published
• 66