
Suraj

ghishadow

AI & ML interests

None yet

Recent Activity

upvoted a paper 11 days ago

Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published 21 days ago • 57

liked a model 12 days ago

facebook/sam3

Mask Generation • Updated Nov 20, 2025 • 1.86M • 1.69k

upvoted an article 13 days ago

Small Language Models (SLM): A Comprehensive Overview

Article • Feb 22, 2025 • 138

liked a Space 13 days ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

📝

Who needs 1T parameters? Olympiad proofs with a 4B model

upvoted an article 14 days ago

Bamba: Inference-Efficient Hybrid Mamba2 Model

Article • Dec 18, 2024

upvoted an article 17 days ago

GGML and llama.cpp join HF to ensure the long-term progress of Local AI

Article • 20 days ago • 481

liked 3 models 3 months ago

upvoted a collection 3 months ago

Ministral 3

Collection

Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated about 10 hours ago • 30

liked 2 models 4 months ago

litert-community/Gemma3-1B-IT

Text Generation • Updated Jan 9 • 27.9k • 551

maya-research/maya1

Text-to-Speech • Updated Nov 12, 2025 • 91.3k • 870

upvoted a paper 5 months ago

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

liked 2 models 5 months ago

rednote-hilab/dots.ocr

Image-Text-to-Text • Updated Oct 31, 2025 • 238k • 1.27k

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 7.47M • 4.45k

upvoted an article 6 months ago

The Hacker's Guide to Building an AI Supercluster

Article • Aug 31, 2025

liked a Space 6 months ago

The Ultra-Scale Playbook

🌌

3.74k

The ultimate guide to training LLM on large GPU Clusters

upvoted a collection 7 months ago

Gemma 3-270m

Collection

Collection of models for Gemma 3-270m • 4 items • Updated Dec 16, 2025 • 21

liked a Space 7 months ago

Wllama

🦙

Run GGUF directly on your browser!

liked a model 7 months ago

google/gemma-3-270m

Text Generation • Updated Aug 14, 2025 • 95.7k • 994
