Submitted by akhaliq 48 Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models · 8 authors 3.33k 4
Submitted by akhaliq 28 ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion · 6 authors 4
Submitted by akhaliq 25 Garment3DGen: 3D Garment Stylization and Texture Generation · 6 authors 123 3
Submitted by akhaliq 23 BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text · 11 authors 635 3
Submitted by akhaliq 20 Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction · 7 authors 156 2
Submitted by akhaliq 12 EgoLifter: Open-world 3D Segmentation for Egocentric Perception · 6 authors 138 1
Submitted by akhaliq 11 FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing · 4 authors 1
Submitted by akhaliq 6 Towards a World-English Language Model for On-Device Virtual Assistants · 6 authors 1