UEG: Universal Edge Gateway Classifier

A 35M-parameter bidirectional transformer encoder for intent classification and AI request routing.

Model Description

UEG classifies incoming user text into 22 intent classes across 5 routing tiers, plus a secondary 5-class language resource density classification. Both outputs come from a single forward pass.

  • Architecture: 6-layer bidirectional transformer encoder, 512 hidden dim, 8 attention heads
  • Parameters: ~35M
  • Max sequence length: 128 tokens
  • Tokenizer: Custom BPE trained on the UEG training corpus (32K vocab)
  • Languages: English, Arabic, Hindi, French, Spanish, Chinese, Swahili, Portuguese
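As a rough sanity check on the ~35M figure, the sketch below tallies parameters for an encoder with the dimensions listed above. The feed-forward width (2048, i.e. 4x the hidden size), a vocabulary of exactly 32,000, learned position embeddings, and two plain linear heads are all assumptions made for illustration; the real values live in export/config.json.

```python
# Back-of-envelope parameter count for the listed architecture.
# ASSUMPTIONS (not stated in the model card): FFN width = 4 * hidden,
# vocab = 32_000 exactly, learned position embeddings, biases on all
# linear layers, one linear classification head per output.

def estimate_params(vocab=32_000, hidden=512, layers=6, max_len=128,
                    ffn=2048, intent_classes=22, resource_classes=5):
    embeddings = vocab * hidden + max_len * hidden
    attention = 4 * (hidden * hidden + hidden)      # Q, K, V and output projections
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer_norms = 2 * (2 * hidden)                  # two LayerNorms per layer
    per_layer = attention + feed_forward + layer_norms
    heads = (hidden * intent_classes + intent_classes) \
        + (hidden * resource_classes + resource_classes)
    return embeddings + layers * per_layer + heads

total = estimate_params()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands close to 35M, with a little under half of the parameters in the token embedding table.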

Performance

Head                          Accuracy  Macro F1
Intent (22 classes)           97.35%    0.9733
Resource Density (5 classes)  99.95%    0.9987
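Macro F1 averages per-class F1 without weighting by class frequency, so it penalizes weak minority classes more than accuracy does. As an illustration only, the sketch below computes both metrics from label lists; the toy labels are made up and are not UEG evaluation data.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class gains a false positive
            fn[t] += 1   # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1s.append(2 * tp[c] / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

# Toy example (hypothetical labels): class 2 is never predicted correctly
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 1, 1]
print(f"accuracy={accuracy(y_true, y_pred):.3f}  macro_f1={macro_f1(y_true, y_pred):.3f}")
```

In the toy data a single missed minority class leaves accuracy fairly high while macro F1 drops sharply, which is the gap the two reported numbers are meant to expose.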

Usage

Via REST API (recommended)

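import requests

# POST the text to the hosted classifier endpoint
r = requests.post("https://ueg-api.onrender.com/classify",
                  json={"text": "Write a Python function to sort a list"},
                  timeout=30)
r.raise_for_status()  # raise on HTTP error responses
print(r.json())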

Via ONNX Runtime

import onnxruntime as ort
import numpy as np
from tokenizers import Tokenizer
from huggingface_hub import hf_hub_download

# Load tokenizer
tok_path = hf_hub_download("rufatronics/ueg-classifier",
                            "tokenizer/tokenizer.json")
tokenizer = Tokenizer.from_file(tok_path)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=128)
tokenizer.enable_truncation(max_length=128)

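# Download the ONNX model and its external data file. ONNX Runtime looks for
# the .data file next to the .onnx file, so both must be downloaded;
# data_path is not passed to the session directly.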
onnx_path = hf_hub_download("rufatronics/ueg-classifier",
                             "export/ueg_model.onnx")
data_path = hf_hub_download("rufatronics/ueg-classifier",
                             "export/ueg_model.onnx.data")

sess = ort.InferenceSession(onnx_path,
                             providers=["CPUExecutionProvider"])

# Inference
enc  = tokenizer.encode("Write a Python function to reverse a string")
ids  = np.array([enc.ids], dtype=np.int64)
mask = np.array([enc.attention_mask], dtype=np.int64)

logits_intent, logits_resource = sess.run(None,
    {"input_ids": ids, "attention_mask": mask})

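# Batch size is 1, so argmax over the flattened logits gives the class index
intent_class = int(np.argmax(logits_intent))
resource_class = int(np.argmax(logits_resource))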
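To turn raw logits into readable predictions, a softmax followed by a top-k ranking is usually enough. The sketch below is a minimal pure-Python version; the helper names `softmax` and `top_k`, the label strings, and the example logits are all hypothetical and chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a 1-D sequence of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax([float(x) for x in logits])
    ranked = sorted(zip(labels, probs), key=lambda lp: lp[1], reverse=True)
    return ranked[:k]

# Hypothetical labels and logits, for illustration only
labels = ["class_a", "class_b", "class_c"]
example_logits = [2.0, 0.5, -1.0]
for label, prob in top_k(example_logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the session output from the ONNX example, and assuming `logits_intent` has shape (1, 22), this would be called as `top_k(logits_intent[0], labels)` with `labels` built from labels/intent_classes.json. The layout of that file is not described here, so the loading step has to be adapted to whatever mapping it contains.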

Files

File                             Description
checkpoint_best.pt               PyTorch weights (best validation epoch)
checkpoint_latest.pt             PyTorch weights (final epoch)
export/ueg_model.onnx            ONNX model for production inference
export/ueg_model.onnx.data       ONNX external data (required alongside .onnx)
export/config.json               Architecture hyperparameters
export/benchmark.json            Inference latency benchmark
tokenizer/tokenizer.json         Tokenizer definition
tokenizer/tokenizer_config.json  Tokenizer config with pad/cls/sep IDs
labels/intent_classes.json       Intent class label mappings
labels/resource_classes.json     Resource density class mappings

Training

Trained from scratch on 176K synthetic examples generated by the UEG data generation pipeline using the free tiers of Groq, Gemini, and Mistral. Training runs in three phases: warmup → cosine decay → head fine-tuning, with early stopping (patience=4).
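The first two phases and the stopping rule can be sketched as follows. This is not the project's training code: the linear warmup shape, the peak learning rate, the step counts, and the choice of validation loss as the monitored metric are assumptions, and the head fine-tuning phase is not modeled.

```python
import math

def lr_at(step, total_steps, warmup_steps, peak_lr=3e-4, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay to min_lr (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

class EarlyStopping:
    """Signal a stop once the loss has not improved for `patience` epochs."""

    def __init__(self, patience=4):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation losses to exercise the stopping rule
stopper = EarlyStopping(patience=4)
val_losses = [0.9, 0.7, 0.71, 0.72, 0.70, 0.73, 0.74]
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        print(f"stopping after epoch {epoch}")
        break
```

With patience=4, training halts at the fourth consecutive epoch that fails to beat the best loss seen so far, which pairs naturally with keeping a best-epoch checkpoint such as checkpoint_best.pt.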

Full training code: https://github.com/rufatronics/ueg-datagen

Citation

@misc{ueg2026,
  title={UEG: Universal Edge Gateway for AI Request Routing},
  author={Ahmad Garba},
  year={2026},
  url={https://huggingface.co/rufatronics/ueg-classifier}
}

License

MIT
