# TERA V2
A language model built entirely from scratch. No pretrained weights. No standard transformers.
## Architecture
TERA V2 uses a custom non-transformer architecture with the following components:
- Time Mix for sequence mixing
- Token Shift for position encoding
- GroupNorm for normalization
- Channel Mix with Squared ReLU for feed-forward
- Stochastic Depth for regularization
- Untied Embeddings
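As a rough illustration of two of these components, here is a minimal NumPy sketch of Token Shift and Channel Mix with Squared ReLU. The shapes, mixing coefficient, and weight scales are illustrative only and are not taken from the actual `model.py`:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8   # hypothetical; the real model uses 128
seq_len = 4

def token_shift(x, mu=0.5):
    # Blend each token with its predecessor (zero-padded at position 0),
    # giving the model a cheap, learnable notion of position.
    prev = np.vstack([np.zeros((1, x.shape[1])), x[:-1]])
    return mu * x + (1.0 - mu) * prev

def channel_mix(x, w_in, w_out):
    # Feed-forward sublayer with Squared ReLU: (max(0, x @ W1))^2 @ W2
    h = np.maximum(0.0, x @ w_in) ** 2
    return h @ w_out

x = rng.normal(size=(seq_len, d_model))
w_in = rng.normal(size=(d_model, 4 * d_model)) * 0.1
w_out = rng.normal(size=(4 * d_model, d_model)) * 0.1

y = channel_mix(token_shift(x), w_in, w_out)
print(y.shape)  # (4, 8)
```

Squared ReLU keeps the activation non-negative and sharpens it relative to plain ReLU, while token shift lets each position see a blend of itself and its predecessor before channel mixing.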
## Model Specifications
| Specification | Value |
|---|---|
| Parameters | ~726K |
| Vocabulary Size | 510 |
| Context Length | 32 tokens |
| Hidden Size (d_model) | 128 |
| Attention Heads | 4 |
| Layers | 3 |
| Framework | TensorFlow / Keras |
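The table above plausibly corresponds to a `model_config.json` along these lines. The field names here are hypothetical; consult the actual file in the repository for the real schema:

```json
{
  "vocab_size": 510,
  "context_length": 32,
  "d_model": 128,
  "num_heads": 4,
  "num_layers": 3
}
```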
## Training Details
- Trained from scratch on clean question-answer pairs
- No pretrained weights were used at any stage
- Custom BPE-lite tokenizer trained on the same data
- Loss function: Sigmoid cross-entropy
- Optimizer: Adam with cosine learning rate schedule
- Training format: `Q: question / A: answer`
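The exact peak learning rate and step count are not stated above, so the values below are illustrative, but the shape of a cosine learning-rate schedule like the one used with Adam can be sketched in a few lines:

```python
import math

def cosine_lr(step, total_steps, peak_lr=1e-3, min_lr=0.0):
    # Cosine learning-rate schedule: decays smoothly from peak_lr at
    # step 0 to min_lr at total_steps. peak_lr/min_lr are placeholders.
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0, 100))    # 0.001 (peak)
print(cosine_lr(100, 100))  # ~0.0 (fully decayed)
```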
## How To Use

1. Download all files from this repository
2. Install TensorFlow
3. Load the tokenizer from `tokenizer.json`
4. Build the model using `model_config.json`
5. Load the weights from `model.weights.h5`
6. Format input as: `Q: your question here / A:`
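The final step can be sketched as a small helper. This assumes the `/` in the documented format denotes a line break between the question and answer lines, which matches the `Q:`/`A:` training format above:

```python
def format_prompt(question: str) -> str:
    # Wrap a user question in the prompt format the model was trained on,
    # ending with "A:" so the model completes the answer.
    return f"Q: {question}\nA:"

print(format_prompt("What is the sun?"))
# Q: What is the sun?
# A:
```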
## Example Input and Output

**Input:** `Q: What is the sun?`
**Output:** The sun is a star at the center of our solar system.

**Input:** `Q: Hello`
**Output:** Hello! How can I help you today?
## Files Included
| File | Description |
|---|---|
| `model.py` | Model architecture code |
| `tokenizer.py` | Tokenizer class code |
| `model_config.json` | Model hyperparameters |
| `tokenizer.json` | Trained tokenizer vocabulary |
| `model.weights.h5` | Trained model weights |
| `training_data.py` | Training data used |
| `loss_history.json` | Training loss over epochs |
| `training_state.json` | Final training stats |
## Live Demo
Try TERA V2 live at: https://huggingface.co/spaces/vedaco/tera.v2
## Created By
Vedaco Team
## License
Apache 2.0