Ureola 50M β€” Base

"She didn't quite turn out to be the best but I'm not done cooking 😂 — honestly she speaks better English than most of us, with only 15 hours spent learning from 59 million tokens, from scratch." — Neon, creator


What is Ureola?

Ureola is an open-source decoder-only language model built entirely from scratch by Neon of NeonTech β€” a one-man development team based in Port Harcourt, Nigeria.

No fine-tuned base. No borrowed weights. Every parameter in this model was initialized randomly and trained from zero.

This is Ureola 50M Base β€” the first checkpoint. She understands questions. She just doesn't always know how to answer them yet. The instruct version is coming.


Model Details

Property             Value
-------------------  -----------------------------------------
Parameters           49.9M
Architecture         Decoder-only Transformer
Layers               8
Attention heads      8
Embedding dim        512
FFN hidden dim       2048 (SwiGLU)
Context length       512 tokens
Positional encoding  RoPE (Rotary Position Embedding)
Normalization        RMSNorm (pre-norm)
Activation           SwiGLU
Weight tying         Yes (embedding ↔ LM head)
Tokenizer            Meta LLaMA BPE (32,004-token vocabulary)
Precision            float16 (training), float32 (inference)
License              Apache 2.0
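
As a sanity check, the 49.9M figure can be reproduced from the dimensions in the table. This is a rough count, assuming bias-free linear layers and the tied embedding/LM head; the variable names are illustrative:

```python
# Rough parameter count for Ureola 50M from the dimensions above.
# Assumes no biases; the tied LM head adds no extra parameters.
vocab, d_model, n_layers, d_ffn = 32004, 512, 8, 2048

embedding = vocab * d_model              # shared with the LM head (weight tying)
attention = 4 * d_model * d_model        # Q, K, V, and output projections
swiglu    = 3 * d_model * d_ffn          # gate, up, and down projections
norms     = 2 * d_model                  # two RMSNorm scales per block

per_layer = attention + swiglu + norms
total = embedding + n_layers * per_layer + d_model  # + final RMSNorm

print(f"{total / 1e6:.1f}M parameters")
```

Weight tying is what keeps this at ~50M: an untied LM head would add another 32,004 × 512 ≈ 16.4M parameters, matching the "~16M saved" figure below.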

Training Details

Property         Value
---------------  --------------------------------------------------
Dataset          OpenHermes 2.5 (200k samples)
Tokens seen      ~59 million
Training steps   20,000
Training time    ~15 hours
Hardware         Tesla T4 16GB (Kaggle free tier)
Optimizer        AdamW (β₁ = 0.9, β₂ = 0.95)
Learning rate    3e-4 with cosine decay
Warmup steps     500
Batch size       32 × 4 gradient-accumulation steps = 128 effective
Final val loss   ~1.20
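
The schedule in the table (500 warmup steps, then cosine decay from 3e-4 over 20,000 steps) can be sketched in PyTorch as follows. The minimum-LR floor here is an assumption for illustration; the card does not state one:

```python
import math
import torch

STEPS, WARMUP, PEAK_LR, MIN_LR = 20_000, 500, 3e-4, 3e-5  # MIN_LR is assumed

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward MIN_LR."""
    if step < WARMUP:
        return PEAK_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / (STEPS - WARMUP)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

model = torch.nn.Linear(8, 8)  # stand-in for the real model
opt = torch.optim.AdamW(model.parameters(), lr=PEAK_LR, betas=(0.9, 0.95))
sched = torch.optim.lr_scheduler.LambdaLR(opt, lambda s: lr_at(s) / PEAK_LR)
```

The 128-token-batch figure comes from stepping the optimizer only every 4 micro-batches of 32, which is how a 50M model fits training on a single 16GB T4.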

Architecture Highlights

Ureola uses a modern decoder-only transformer architecture with several design choices borrowed from state-of-the-art models:

  • RoPE β€” Rotary positional embeddings for better length generalization
  • SwiGLU β€” Gated activation function used in LLaMA, PaLM, and Mistral
  • RMSNorm β€” Pre-normalization for stable training
  • Weight tying β€” Embedding and LM head share weights, saving ~16M parameters
  • No bias β€” Cleaner, faster linear layers
  • Flash Attention β€” Via PyTorch's scaled_dot_product_attention

The entire architecture was designed and implemented from scratch in PyTorch.
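
A compressed sketch of how these pieces fit into one pre-norm decoder block is below. This is not Neon's actual implementation: RoPE is omitted for brevity (a comment marks where it would apply), and the class names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization with a learned scale, no bias."""
    def __init__(self, d: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.scale = nn.Parameter(torch.ones(d))

    def forward(self, x):
        return self.scale * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

class SwiGLU(nn.Module):
    """Gated feed-forward: down(silu(gate(x)) * up(x)), all bias-free."""
    def __init__(self, d_model: int, d_ffn: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ffn, bias=False)
        self.up = nn.Linear(d_model, d_ffn, bias=False)
        self.down = nn.Linear(d_ffn, d_model, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

class Block(nn.Module):
    """Pre-norm decoder block: RMSNorm -> attention -> RMSNorm -> SwiGLU."""
    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ffn: int = 2048):
        super().__init__()
        self.n_heads = n_heads
        self.attn_norm = RMSNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.proj = nn.Linear(d_model, d_model, bias=False)
        self.ffn_norm = RMSNorm(d_model)
        self.ffn = SwiGLU(d_model, d_ffn)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(self.attn_norm(x)).chunk(3, dim=-1)
        q, k, v = (z.view(b, t, self.n_heads, -1).transpose(1, 2) for z in (q, k, v))
        # RoPE would rotate q and k here, before the fused attention call.
        a = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.proj(a.transpose(1, 2).reshape(b, t, d))
        return x + self.ffn(self.ffn_norm(x))
```

`scaled_dot_product_attention` dispatches to a Flash Attention kernel when the hardware and dtype allow it, which is how the card gets fused attention without hand-written kernels.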


Chat Format

Ureola uses a simple chat template:

<|system|>
You are Ureola, a helpful and friendly AI assistant made by Neon of NeonTech.
<|user|>
Hello! Who are you?
<|assistant|>
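
A small helper that assembles this template from a user turn (the function name and exact whitespace are illustrative; only the special tokens shown above come from the card):

```python
SYSTEM = "You are Ureola, a helpful and friendly AI assistant made by Neon of NeonTech."

def build_prompt(user_message: str, system: str = SYSTEM) -> str:
    """Wrap a single user turn in Ureola's chat template, ending with the
    assistant tag so the model continues generating from that point."""
    return (
        f"<|system|>\n{system}\n"
        f"<|user|>\n{user_message}\n"
        f"<|assistant|>\n"
    )

print(build_prompt("Hello! Who are you?"))
```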

Usage

import torch
from transformers import LlamaTokenizer
from safetensors.torch import load_file

# Load tokenizer
tokenizer = LlamaTokenizer.from_pretrained("Neon-tech/Ureola-50M-base")
tokenizer.add_special_tokens({
    "additional_special_tokens": ["<|system|>", "<|user|>", "<|assistant|>", "<|end|>"]
})

# Load model weights
# (requires the UreolaMini architecture class from the model card)
weights = load_file("model.safetensors")
# see Spaces demo for full inference code

Honest Assessment

This is a base model trained on a single GPU for 15 hours. Here is what it can and cannot do:

Can:

  • Generate coherent, grammatically correct English
  • Follow conversational structure
  • Produce structured responses (lists, steps, paragraphs)
  • Understand the type of question being asked

Cannot (yet):

  • Reliably answer specific questions accurately
  • Follow instructions precisely
  • Know who it is without being told
  • Perform arithmetic or reasoning tasks

These limitations are expected for a 50M base model. The instruct fine-tuned version (coming soon) addresses instruction following directly.


What's Next

Ureola 50M Base      ← you are here
Ureola 50M Instruct  ← fine-tuned on GPT-4 quality instructions (coming soon)
Ureola 50M-T         ← thinking version with chain-of-thought (coming soon)
NeonTokenizer        ← custom BPE tokenizer for all future Ureola models
Ureola 100M          ← next scale (coming soon)

About

Neon is an independent developer and the founder of NeonTech, building open-source AI systems from Port Harcourt, Nigeria.

The model is named after Ureola — NeonTech's general-purpose AI chat platform, which this assistant powers.

This model represents NeonTech's first step into open-weight language model development. Everything was built with limited compute, no institutional backing, and a lot of patience.

"All the stress. Everything. And I'm really proud of her." β€” Neon


Citation

@misc{ureola50m2025,
  title  = {Ureola 50M: A Decoder-only Language Model Trained from Scratch},
  author = {Neon, NeonTech},
  year   = {2025},
  url    = {https://huggingface.co/Neon-tech/Ureola-50M-base}
}

Built from scratch. Trained on a free GPU. Made in Nigeria. πŸ‡³πŸ‡¬
