tarn59/book_flatten_and_crop_qwen_image_edit_2509 Image-to-Image • Updated 28 days ago • 239 • 37
Running on Zero • Featured • ReconViaGen • 155 • High-fidelity 3D Geometry Generation from multi-view images
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated 12 days ago • 171
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29 • 68
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
Runtime error • TRELLIS - Multiple Imagen a 3D • 61 • Scalable and Versatile 3D Generation from images
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17 • 102k • 1.6k
Article Llama can now see and run on your device - welcome Llama 3.2 +5 • Sep 25, 2024 • 191
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 134k • 1.55k
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 647
Running • Featured • OpenVoice • 1.12k • Generate customized speech from text using a reference audio