Collections
Discover the best community collections!
Collections trending this week
-
AEON-7/Gemma-4-31B-it-DECKARD-HERETIC-Uncensored-NVFP4
Text Generation • 18B • Updated • 3.11k • 10 -
AEON-7/Gemma-4-31B-it-DECKARD-HERETIC-Uncensored-NVFP4-SVDQuant
Text Generation • 19B • Updated • 846 • 2 -
AEON-7/Gemma-4-26B-A4B-it-Uncensored-NVFP4
Text Generation • 15B • Updated • 74.3k • 21 -
AEON-7/Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4
Text Generation • 6B • Updated • 244 • 1
-
BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs
Paper • 2604.02045 • Published • 39 -
BidirLM/BidirLM-Omni-2.5B-Embedding
Sentence Similarity • 2B • Updated • 349 • 45 -
BidirLM/BidirLM-1.7B-Embedding
Sentence Similarity • 2B • Updated • 332 • 6 -
BidirLM/BidirLM-1B-Embedding
Sentence Similarity • 1.0B • Updated • 538 • 3
-
Continuous Latent Diffusion Language Model
Paper • 2605.06548 • Published • 85 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 233 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 158 -
Pretraining Language Models to Ponder in Continuous Space
Paper • 2505.20674 • Published • 3
-
JetBrains/Mellum2-12B-A2.5B-Thinking
Text Generation • 12B • Updated • 27.2k • 313 -
JetBrains/Mellum2-12B-A2.5B-Instruct
Text Generation • 12B • Updated • 8.31k • 77 -
JetBrains/Mellum2-12B-A2.5B-Thinking-SFT
Text Generation • 12B • Updated • 821 • 25 -
JetBrains/Mellum2-12B-A2.5B-Instruct-SFT
Text Generation • 12B • Updated • 361 • 14
-
QwenScope
🔥 38 • Explore and steer Qwen3 model features with interactive heatmaps
-
Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models
Paper • 2605.11887 • Published • 18 -
Qwen/SAE-Res-Qwen3.5-27B-W80K-L0_50
Updated • 680 • 38 -
Qwen/SAE-Res-Qwen3.5-2B-Base-W32K-L0_50
Updated • 129 • 13
-
Qwen/Qwen3-ASR-1.7B
Automatic Speech Recognition • 2B • Updated • 1.42M • 909 -
Qwen/Qwen3-ASR-0.6B
Automatic Speech Recognition • 0.9B • Updated • 903k • 310 -
Qwen/Qwen3-ForcedAligner-0.6B
Automatic Speech Recognition • 0.9B • Updated • 465k • 145 -
Qwen3-ASR Demo
🎙 141 • Transcribe audio to text with timestamps and visualization
-
nvidia/Nemotron-Labs-Diffusion-8B
Text Generation • 8B • Updated • 131k • 50 -
nvidia/Nemotron-Labs-Diffusion-VLM-8B
Image-Text-to-Text • 9B • Updated • 3.59k • 26 -
nvidia/Nemotron-Labs-Diffusion-14B
Text Generation • 14B • Updated • 13.1k • 148 -
nvidia/Nemotron-Labs-Diffusion-3B
Text Generation • 4B • Updated • 90k • 35