Dexterous VLA utilizing human ego data training
Papers
UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
RoboBrain 2.5: Depth in Sight, Time in Mind.
URSA: Uniform Discrete Diffusion with Metric Path for Video Generation
- URSA-1.7B-FSQ320
  🎞 URSA Text-to-Image-to-Video • 22
- Uniform Discrete Diffusion with Metric Path for Video Generation
  Paper • 2510.24717 • Published • 44
- UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
  Paper • 2604.18518 • Published • 5
- BAAI/URSA-0.6B-IBQ1024
  Text-to-Image • Updated • 519 • 6
RoboBrain 2.0: See Better. Think Harder. Do Smarter.
Scaling Instruction Selection and Synthesis to Enhance Language Models
- BAAI/Infinity-Instruct
  Viewer • Updated • 21.9M • 4.45k • 711
- BAAI/Gemma2-9B-IT-Simpo-Infinity-Preference
  9B • Updated • 40 • 17
- BAAI/Infinity-Instruct-7M-Gen-Llama3_1-70B
  Text Generation • 71B • Updated • 394 • 19
- BAAI/Infinity-Instruct-3M-0625-Yi-1.5-9B
  Text Generation • 9B • Updated • 7.89k • 3
NOVA: Autoregressive Video Generation without Vector Quantization
Multilingual, Multi-Industry Instruction Dataset
Tactile Hierarchical Dynamic Dataset
Native Multimodal Models are World Learners 🌍
Efficient MLLM for Long Video Understanding.
A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models
open-source community driven next generation of AI models
Emu3: Next-Token Prediction is All You Need
Chinese Corpora Internet (Chinese internet corpus)
Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
- BAAI/Infinity-MM
  Updated • 41k • 120
- BAAI/Aquila-VL-2B-llava-qwen
  Visual Question Answering • Updated • 47 • 62
- Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
  Paper • 2410.18558 • Published • 19
- BAAI/Aquila-VL-2B-Intermediate
  Image-Text-to-Text • Updated • 3
Alt
- BAAI/AltCLIP
  Zero-Shot Image Classification • Updated • 98.5k • 31
- BAAI/AltCLIP-m18
  Zero-Shot Image Classification • Updated • 99 • 5
- AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities
  Paper • 2211.06679 • Published • 2
- AltDiffusion: A Multilingual Text-to-Image Diffusion Model
  Paper • 2308.09991 • Published • 3
Multilingual, Multi-Industry Pretraining Dataset