GuppyLM: 9M-Parameter Fish Chat Model

A ~9M parameter LLM trained from scratch that talks like a small fish named Guppy.

This project exists to show that training your own language model is not magic. One Colab notebook, 5 minutes, and you have a working LLM built from scratch.

Example

You> hi guppy
Guppy> hi there. i just found a nice spot near the rock. the temperature feels nice.

You> are you hungry
Guppy> yes. always yes. i will swim to the top right now. i promise to eat all of it.

You> what is the meaning of life
Guppy> food. the answer is always food.

You> tell me a joke
Guppy> what did the fish say when it hit the wall. dam.

You> goodnight guppy
Guppy> ok sleep time. i was following a bubble but now i'll stop. goodnight tank. goodnight water.

Architecture

  • Parameters: 8.7M
  • Type: vanilla transformer (from scratch)
  • Layers: 6
  • Hidden dim: 384
  • Attention heads: 6
  • FFN dim: 768 (ReLU)
  • Vocab: 4,096 (BPE)
  • Max sequence: 128 tokens
  • Norm: LayerNorm
  • Positions: learned embeddings
  • LM head: weight-tied with token embeddings

No GQA, no RoPE, no SwiGLU, no early exit. As simple as it gets.
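
With such a plain architecture, the parameter count can be checked by hand. A back-of-the-envelope tally (a sketch that ignores bias and LayerNorm parameters, and assumes the LM head adds nothing because it is weight-tied to the token embeddings):

```python
# Rough parameter count for the GuppyLM config.
# Assumption: biases and LayerNorm scales/offsets are omitted, so the
# real total may differ by a few tens of thousands of parameters.
vocab, d, layers, ffn, seq = 4096, 384, 6, 768, 128

tok_emb = vocab * d              # token embeddings (shared with the LM head)
pos_emb = seq * d                # learned position embeddings
attn = 4 * d * d                 # Q, K, V, and output projections per layer
ffn_w = d * ffn + ffn * d        # the two FFN linear layers per layer
total = tok_emb + pos_emb + layers * (attn + ffn_w)
print(f"{total:,}")              # 8,699,904 — matching the reported 8.7M
```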

Training

  • Data: 60K single-turn synthetic conversations across 60 topics
  • Steps: 10,000
  • Optimizer: AdamW (cosine LR schedule)
  • Hardware: T4 GPU (~5 min)
  • No system prompt; the personality is baked into the weights

Usage

from inference import GuppyInference

engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
r = engine.chat_completion([{'role': 'user', 'content': 'hi guppy'}])
print(r['choices'][0]['message']['content'])
# hi there. i just found a nice spot near the rock.
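
Generation is capped at the 128-token context. A greedy-decoding sketch of what happens under the hood (the repo's actual sampler may use temperature or top-k instead; `model` here is a stand-in for any callable returning next-token logits for a prefix):

```python
# Greedy decoding sketch, capped at the model's 128-token context.
# Assumption: `generate`, `model`, and `eos_id` are illustrative names,
# not the repo's actual API.
def generate(model, tokens, eos_id, max_len=128):
    tokens = list(tokens)
    while len(tokens) < max_len:
        logits = model(tokens)                          # one logit per vocab entry
        nxt = max(range(len(logits)), key=logits.__getitem__)  # argmax token
        if nxt == eos_id:                               # stop at end-of-sequence
            break
        tokens.append(nxt)
    return tokens
```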

Links

  • GitHub
  • Dataset
  • Colab notebook
  • Browser demo
  • LinkedIn article
  • Medium article

License

MIT
