# TIGER-OM (SKT-OM) - 13B MoE Agentic Model

An advanced 13B Mixture-of-Experts (MoE) model optimized for Agentic RAG, with Think Mode and a plugin architecture.
Built for the AMD Developer Hackathon 2026 on the AMD Developer Cloud.
## Model Details

- Model Name: TIGER-OM (SKT-OM)
- Architecture: Mixture of Experts (MoE)
- Total Parameters: 13B (active parameters per token are much lower due to MoE sparsity)
- Base Models:
  - Primary Base: Shrijanagain/ST-X-0
  - Expert Integration: Mistral-7B
- Format: Safetensors (safe, fast loading)
- Quantization: FP16 / BF16 (original); Q4_K_M GGUF available in a separate repo
- Context Length: 8192 tokens
- Training Hardware: AMD Developer Cloud GPUs ($100 developer credits)
- Inference: optimized for ROCm 7.0 + vLLM on AMD MI300X
## Key Features

- True MoE Architecture – sparse activation for better efficiency and performance
- Think Mode Reasoning – advanced chain-of-thought, planning, self-reflection, and verification
- Dynamic Plugin System – intelligent routing to Code, Math, Search, and Data Analysis plugins (see the sketch after this list)
- Agentic Capabilities – full LangGraph multi-agent workflow
- Advanced RAG Integration – SKT RAG + query rewriting + multi-hop retrieval + reranking
- Stateful Memory – persistent conversation context
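The card describes the plugin system only at a high level, so the following is a hypothetical minimal dispatcher, purely to illustrate the routing idea. `PLUGINS`, `dispatch`, and the handler stubs are invented names for this sketch, not TIGER-OM's actual API.

```python
from typing import Callable, Dict

# Hypothetical plugin registry: the model emits a tool call, a controller
# routes it. None of these names come from the TIGER-OM codebase.
PLUGINS: Dict[str, Callable[[str], str]] = {
    "code":   lambda query: f"[code plugin handled] {query}",
    "math":   lambda query: f"[math plugin handled] {query}",
    "search": lambda query: f"[search plugin handled] {query}",
    "data":   lambda query: f"[data-analysis plugin handled] {query}",
}

def dispatch(plugin_name: str, query: str) -> str:
    """Route a model-emitted tool call to the matching plugin handler."""
    handler = PLUGINS.get(plugin_name)
    if handler is None:
        raise KeyError(f"unknown plugin: {plugin_name!r}")
    return handler(query)

print(dispatch("math", "integrate x**2 from 0 to 1"))
```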
## Architecture Breakdown

TIGER-OM is built on a 13B MoE backbone:

- Base: Shrijanagain/ST-X-0 (strong foundational model)
- Experts: fine-tuned from Mistral-7B as expert layers for specialized reasoning and tool-use capabilities
- Router Network: learned gating mechanism for expert selection (see the sketch below)
- Think Mode Layer: custom system prompt + reasoning controller
- Plugin Head: tool-calling and execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) delivers strong reasoning, code understanding, and general intelligence while retaining MoE efficiency.
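The card does not publish the router internals, so here is a generic top-k gating sketch in PyTorch, only to make "learned gating mechanism" concrete. The `hidden_size`, `num_experts`, and `top_k` values are placeholders, not TIGER-OM's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Generic top-k gate: score every expert, keep the best k, renormalize."""
    def __init__(self, hidden_size: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, hidden_states: torch.Tensor):
        logits = self.gate(hidden_states)                    # (tokens, num_experts)
        weights, expert_ids = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                 # renormalize over the chosen experts
        return weights, expert_ids                           # used to mix the selected experts' outputs

router = TopKRouter(hidden_size=4096, num_experts=8, top_k=2)  # placeholder sizes
weights, expert_ids = router(torch.randn(4, 4096))             # route 4 example tokens
```

Only the `top_k` selected experts run per token, which is where the "active parameters much lower than 13B" efficiency claim comes from.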
## Files in this Repo (Safetensors)

- model-00001-of-0000X.safetensors – main model weights (sharded)
- config.json
- tokenizer.json / tokenizer_config.json
- generation_config.json
- special_tokens_map.json
- model.safetensors.index.json – shard index

All weights are in the safetensors format – no pickle deserialization risk.
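Because safetensors files are plain tensor containers, you can inspect shards without deserializing arbitrary code. A small sketch using the `safetensors` library; the shard filename below mirrors the pattern above and is not the exact name:

```python
import json
from safetensors import safe_open

# Map every tensor name to the shard file that stores it.
with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]
print(f"{len(weight_map)} tensors across {len(set(weight_map.values()))} shards")

# Peek at tensor shapes in one shard without loading the full weights.
with safe_open("model-00001-of-0000X.safetensors", framework="pt", device="cpu") as shard:
    for name in list(shard.keys())[:5]:
        print(name, shard.get_slice(name).get_shape())
```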
## How to Use (Safetensors)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,   # BF16 weights; use float16 if BF16 is unsupported
    device_map="auto",
    trust_remote_code=True,
)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Calculate training cost comparison and suggest best option..."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
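If the bundled tokenizer ships a chat template (an assumption; check tokenizer_config.json), the model can also be driven through the standard transformers chat interface. This continues from the variables defined above:

```python
# Assumes the tokenizer defines a chat template; fall back to the raw prompt above if not.
messages = [
    {"role": "system", "content": "You are SKT-OM, an advanced agentic AI with Think Mode enabled."},
    {"role": "user", "content": "Calculate training cost comparison and suggest best option..."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=1024, temperature=0.7, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))  # new tokens only
```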
## Important Links
- Live Demo: SKT-OM Space
- GGUF Quantized (Q4_K_M): Shrijanagain/TIGER-GGUF (see the loading sketch after this list)
- GitHub (RAG + ADK Code): SHRIJANAGAIN/SKT-AMD-FILES
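For CPU or low-VRAM setups, the Q4_K_M GGUF build can be loaded with llama-cpp-python. A sketch, assuming the quantized file in the TIGER-GGUF repo follows the usual `*Q4_K_M.gguf` naming; verify the exact filename in that repo:

```python
from llama_cpp import Llama

# The filename glob is an assumption; check Shrijanagain/TIGER-GGUF for the real name.
llm = Llama.from_pretrained(
    repo_id="Shrijanagain/TIGER-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=8192,                # matches the model's stated context length
)
out = llm("User Query: Summarize MoE routing in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```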
## Technologies & Stack

- Base Models: Shrijanagain/ST-X-0 + Mistral-7B experts
- RAG: SKT RAG + AMD ADK Kit
- Agents: LangGraph
- Hardware: AMD MI300X + ROCm 7.0
- Inference: vLLM (FP16) + transformers (safetensors) – see the vLLM sketch after this list
- Training: AMD Developer Cloud
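A minimal vLLM offline-inference sketch matching the stack above; the dtype and sampling values mirror the transformers example and are not tuned settings from the authors:

```python
from vllm import LLM, SamplingParams

# Offline inference; on an MI300X-class GPU this runs on the ROCm build of vLLM.
llm = LLM(model="Shrijanagain/TIGER-OM", dtype="bfloat16", trust_remote_code=True)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)
result = llm.generate(["Explain Think Mode in one paragraph."], params)
print(result[0].outputs[0].text)
```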
## Performance
- Excellent balance of quality vs efficiency due to MoE architecture
- Strong performance on reasoning, tool-use, code, and multi-step tasks
- Significantly lower inference cost compared to dense 13B+ models
## Use Cases
- Complex technical Q&A
- Agentic workflows & tool calling
- Research assistance
- Code generation & debugging
- Mathematical & logical reasoning
- Comparative analysis
- Data analysis with plugins
## Hackathon

AMD Developer Hackathon 2026.
Trained entirely on the AMD Developer Cloud.
Built fully in public, with multiple technical updates along the way.
## License
MIT License