wo-datacraft's Collections
Audio Generation
3D Generation
Any-to-Any
Image Classification
Image Generation
Speech Generation
Speech Recognition
Text Generation - General
Text Generation - Reasoning
Text Generation - Vision
Toolkit - AI Papers
Toolkit - Embeddings
Toolkit - Prompting Papers
Toolkit - Segmentation
Toolkit - Utilities
Video Generation

Text Generation - Vision

updated Apr 17

  • google/gemma-4-31B-it

    Image-Text-to-Text • 33B • Updated 4 days ago • 10.2M • 2.72k

  • google/gemma-4-26B-A4B-it

    Image-Text-to-Text • 27B • Updated 4 days ago • 9.45M • 985

  • microsoft/Phi-4-reasoning-vision-15B

    Image-Text-to-Text • 15B • Updated Mar 18 • 157k • 169

  • mistralai/Ministral-3-14B-Instruct-2512

    Updated Jan 15 • 97.9k • 287

  • moonshotai/Kimi-VL-A3B-Thinking-2506

    Image-Text-to-Text • 16B • Updated Jan 30 • 10.2k • 360

  • Qwen/Qwen3.5-9B

    Image-Text-to-Text • 10B • Updated Mar 2 • 8.01M • 1.47k

  • Qwen/Qwen3.5-27B

    Image-Text-to-Text • 28B • Updated 28 days ago • 3.34M • 975

  • Qwen/Qwen3.6-35B-A3B

    Image-Text-to-Text • 36B • Updated 28 days ago • 5.9M • 1.85k

  • zai-org/GLM-OCR

    Image-Text-to-Text • 1B • Updated 3 days ago • 6.72M • 1.76k