Collections

Discover the best community collections!

Collections trending this week

Our most capable model to date, designed for long-horizon work. Apache 2.0.

poolside/Laguna-M.1

Text Generation • 226B • Updated about 9 hours ago • 3.31k • 101
poolside/Laguna-M.1-base

Text Generation • 226B • Updated 13 days ago • 675 • 13
poolside/Laguna-M.1-NVFP4

Text Generation • 131B • Updated about 9 hours ago • 3.13k • 10
poolside/Laguna-M.1-FP8

Text Generation • 226B • Updated about 9 hours ago • 1.8k • 10

nex-agi/Nex-N2-Pro

Text Generation • 397B • Updated 20 days ago • 8.43k • 359
nex-agi/Nex-N2-mini

Text Generation • 35B • Updated 20 days ago • 17k • 265
nex-agi/Nex-N2-Pro-fp8

Text Generation • 397B • Updated 18 days ago • 3.03k • 16

Gemma 4 — DECKARD HERETIC, Multimodal & Speculators

Gemma 4 abliterated/quantized — DECKARD HERETIC 31B, SuperGemma4-26B multimodal, 26B-A4B MoE, plus EAGLE3/DFlash drafters.

AEON-7/Gemma-4-31B-it-DECKARD-HERETIC-Uncensored-NVFP4

Text Generation • 18B • Updated 11 days ago • 3.11k • 10
AEON-7/Gemma-4-31B-it-DECKARD-HERETIC-Uncensored-NVFP4-SVDQuant

Text Generation • 19B • Updated 11 days ago • 846 • 2
AEON-7/Gemma-4-26B-A4B-it-Uncensored-NVFP4

Text Generation • 15B • Updated 11 days ago • 74.3k • 21
AEON-7/Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4

Text Generation • 6B • Updated 11 days ago • 244 • 1

BidirLM is a family of 5 frontier bidirectional encoders, including an omnimodal variant at 2.5B.

BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs

Paper • 2604.02045 • Published Apr 2 • 39
BidirLM/BidirLM-Omni-2.5B-Embedding

Sentence Similarity • 2B • Updated May 12 • 349 • 45
BidirLM/BidirLM-1.7B-Embedding

Sentence Similarity • 2B • Updated 23 days ago • 332 • 6
BidirLM/BidirLM-1B-Embedding

Sentence Similarity • 1.0B • Updated 23 days ago • 538 • 3

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models.

about 2 hours ago

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 85
Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 233
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7, 2025 • 158
Pretraining Language Models to Ponder in Continuous Space

Paper • 2505.20674 • Published May 27, 2025 • 3

google/diffusiongemma-26B-A4B-it

Image-Text-to-Text • 26B • Updated 21 days ago • 1.36M • 1.09k

Mellum2 model weights

about 1 month ago

JetBrains/Mellum2-12B-A2.5B-Thinking

Text Generation • 12B • Updated 19 days ago • 27.2k • 313
JetBrains/Mellum2-12B-A2.5B-Instruct

Text Generation • 12B • Updated 19 days ago • 8.31k • 77
JetBrains/Mellum2-12B-A2.5B-Thinking-SFT

Text Generation • 12B • Updated 19 days ago • 821 • 25
JetBrains/Mellum2-12B-A2.5B-Instruct-SFT

Text Generation • 12B • Updated 19 days ago • 361 • 14

QwenScope 🔥

Space • Running • Agents • Featured • 38
Explore and steer Qwen3 model features with interactive heatmaps
Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

Paper • 2605.11887 • Published May 12 • 18
Qwen/SAE-Res-Qwen3.5-27B-W80K-L0_50

Updated May 13 • 680 • 38
Qwen/SAE-Res-Qwen3.5-2B-Base-W32K-L0_50

Updated May 13 • 129 • 13

Qwen/Qwen3-ASR-1.7B

Automatic Speech Recognition • 2B • Updated Jan 30 • 1.42M • 909
Qwen/Qwen3-ASR-0.6B

Automatic Speech Recognition • 0.9B • Updated Jan 30 • 903k • 310
Qwen/Qwen3-ForcedAligner-0.6B

Automatic Speech Recognition • 0.9B • Updated Jan 30 • 465k • 145
Qwen3-ASR Demo 🎙

Space • Running on Zero • Agents • Featured • 141
Transcribe audio to text with timestamps and visualization

Nemotron-Labs-Diffusion

A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding

nvidia/Nemotron-Labs-Diffusion-8B

Text Generation • 8B • Updated 28 days ago • 131k • 50
nvidia/Nemotron-Labs-Diffusion-VLM-8B

Image-Text-to-Text • 9B • Updated 28 days ago • 3.59k • 26
nvidia/Nemotron-Labs-Diffusion-14B

Text Generation • 14B • Updated 28 days ago • 13.1k • 148
nvidia/Nemotron-Labs-Diffusion-3B

Text Generation • 4B • Updated 28 days ago • 90k • 35
