Zhenxing Mi

Mifucius

AI & ML interests

None yet

Recent Activity

authored a paper 11 days ago

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo

authored a paper 11 days ago

One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

Organizations

None yet

upvoted a paper 3 days ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 11 days ago • 95

upvoted a paper 12 days ago

One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

Paper • 2511.18922 • Published 13 days ago • 10

upvoted a paper about 2 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

upvoted a paper 5 months ago

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Paper • 2504.02542 • Published Apr 3 • 51

upvoted 2 papers 6 months ago

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Paper • 2506.09350 • Published Jun 11 • 48

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 142

upvoted a paper 8 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

upvoted 4 papers 9 months ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 72

Personalize Anything for Free with Diffusion Transformer

Paper • 2503.12590 • Published Mar 16 • 44

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18 • 27

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Paper • 2503.13434 • Published Mar 17 • 27

upvoted 5 papers 10 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 155

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published Feb 20 • 16

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Paper • 2502.13144 • Published Feb 18 • 38

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 211

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Paper • 2502.10458 • Published Feb 12 • 38

upvoted a paper 12 months ago

BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published Dec 13, 2024 • 35

upvoted 3 papers about 1 year ago

MM-Ego: Towards Building Egocentric Multimodal LLMs

Paper • 2410.07177 • Published Oct 9, 2024 • 22

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 70

3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection

Paper • 2410.01647 • Published Oct 2, 2024 • 31