🚀 SKT-OM (TIGER-OM) - Agentic RAG System

Advanced 13B Agentic RAG with Think Mode + Dynamic Plugins + LangGraph

Built for the AMD Developer Hackathon 2026 on AMD Developer Cloud.


🌟 Project Overview

SKT-OM (also known as TIGER-OM) is a 13B-parameter, fully agentic Retrieval-Augmented Generation (RAG) system. It extends traditional RAG by integrating:

  • Think Mode: Advanced multi-step reasoning engine
  • Dynamic Plugin Architecture: Intelligent tool selection and execution
  • LangGraph Multi-Agent Workflow: Stateful agent collaboration
  • SKT RAG: High-performance retrieval pipeline

The system takes natural-language queries and returns reasoned responses, using tools where needed and verifying results before answering.
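
The retrieve-then-answer idea behind any RAG system can be illustrated with a few lines of standard-library Python. This is only a sketch: the bag-of-words cosine scoring stands in for the SKT RAG retriever, whose internals are not described here, and `DOCS`, `retrieve`, and `build_prompt` are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# ASSUMPTION: a toy corpus; a real deployment would index actual documents.
DOCS = [
    "ROCm is AMD's open software stack for GPU computing.",
    "LangGraph builds stateful multi-agent workflows.",
    "GGUF is a file format used by llama.cpp.",
]

query = "What is ROCm?"
prompt = build_prompt(query, retrieve(query, DOCS, k=1))
print(prompt)
```

The resulting `prompt` string is what would be passed to the model as the user message.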


📊 Model Details

  • Model Name: TIGER-OM (SKT-OM)
  • Parameters: 13 Billion
  • Base Model: Custom trained on AMD hardware
  • Quantization: Q4_K_M (Excellent balance between quality and size)
  • GGUF Format: Optimized for CPU + GPU inference
  • Training Hardware: AMD Developer Cloud GPUs ($100 credits)
  • Inference: ROCm 7.0 + vLLM (Full FP16) + GGUF (Q4_K_M)

The Q4_K_M version is intended to keep reasoning quality close to FP16 while using far less memory and running faster on consumer and professional hardware.
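
To put the memory saving in rough numbers, the weight footprint can be estimated from the parameter count and the average bits per weight. The sketch below assumes about 4.85 bits per weight for Q4_K_M, which is an approximate figure that varies by architecture and has not been measured for TIGER-OM. It counts weights only and ignores the KV cache and runtime overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 13e9       # 13 billion parameters, from the model details above
FP16_BITS = 16.0      # full half-precision weights
Q4_K_M_BITS = 4.85    # ASSUMPTION: rough average, not measured for this model

fp16_gb = weight_memory_gb(N_PARAMS, FP16_BITS)
q4_gb = weight_memory_gb(N_PARAMS, Q4_K_M_BITS)

print(f"FP16 weights:   ~{fp16_gb:.1f} GB")
print(f"Q4_K_M weights: ~{q4_gb:.1f} GB")
print(f"Reduction:      ~{fp16_gb / q4_gb:.1f}x")
```

Under these assumptions the weights shrink from roughly 26 GB to roughly 8 GB, which is what makes the quantized model practical on a single consumer GPU or in CPU memory.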


✨ Key Features

  • Think Mode Engine: Chain-of-Thought, Self-Reflection, Verification Loops, and Self-Critique
  • Plugin Ecosystem: Code Runner, Math Solver, Web Search, Data Analyzer, Document Parser + Custom Plugins
  • Advanced RAG: SKT RAG with query rewriting, multi-hop retrieval, reranking & contextual compression
  • Multi-Agent System: LangGraph powered stateful workflow
  • Memory: Persistent conversation state
  • Tool Use: Dynamic plugin routing based on query intent
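
As an illustration of dynamic plugin routing based on query intent, here is a minimal sketch. It is not the project's plugin system: the `Plugin` class, the keyword patterns, and the toy plugin bodies are all hypothetical, and a real router would more likely let the model classify intent than rely on regular expressions.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Plugin:
    name: str
    pattern: re.Pattern           # hypothetical intent trigger
    run: Callable[[str], str]     # takes the query, returns a text result


def math_solver(query: str) -> str:
    """Toy stand-in: adds up the integers found in the query."""
    numbers = [int(n) for n in re.findall(r"-?\d+", query)]
    return str(sum(numbers))


def web_search(query: str) -> str:
    """Toy stand-in: returns a placeholder instead of calling a search API."""
    return f"[stub search results for: {query}]"


PLUGINS = [
    Plugin("math_solver", re.compile(r"\b(sum|add|plus|calculate)\b", re.I), math_solver),
    Plugin("web_search", re.compile(r"\b(latest|news|today|search)\b", re.I), web_search),
]


def route(query: str) -> list[Plugin]:
    """Return every plugin whose intent pattern matches the query."""
    return [p for p in PLUGINS if p.pattern.search(query)]


def execute(query: str) -> dict[str, str]:
    """Run all matching plugins and collect their outputs by name."""
    return {p.name: p.run(query) for p in route(query)}


print(execute("Calculate the sum of 12 and 30"))
```

Custom plugins would be added by appending another `Plugin` entry to the registry; a query that matches no pattern yields an empty result, leaving the model to answer on its own.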

🔗 Important Links


How It Works

graph TD
    A[User Query] --> B[Think Mode]
    B --> C[Decomposition & Planning]
    C --> D[Plugin Router]
    C --> E[SKT RAG Retrieval]
    D --> F[Execute Plugins]
    E --> G[Context Processing]
    F & G --> H[Verification Loop]
    H --> I[LangGraph Synthesis]
    I --> J[Final Response]
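
The flow in the diagram can be sketched in plain Python to show how state moves between stages. This is an assumption-laden illustration rather than the project's LangGraph implementation: every function and field name is hypothetical, and each stage is a stub where the real system would call the model, the plugins, or the retriever.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    query: str
    plan: list[str] = field(default_factory=list)
    tool_results: list[str] = field(default_factory=list)
    context: list[str] = field(default_factory=list)
    verified: bool = False
    answer: str = ""


def think_and_plan(state: State) -> State:
    # Think Mode + Decomposition & Planning: naive split into sub-questions.
    state.plan = [part.strip() for part in state.query.split(" and ") if part.strip()]
    return state


def run_plugins(state: State) -> State:
    # Plugin Router + Execute Plugins: stub result for each planned step.
    state.tool_results = [f"tool-result({step})" for step in state.plan]
    return state


def retrieve_context(state: State) -> State:
    # SKT RAG Retrieval + Context Processing: stub passage for each step.
    state.context = [f"passage-about({step})" for step in state.plan]
    return state


def verify(state: State) -> State:
    # Verification Loop: pass only if every step has a tool result and a passage.
    has_plan = len(state.plan) > 0
    covered = len(state.tool_results) == len(state.plan) == len(state.context)
    state.verified = has_plan and covered
    return state


def synthesize(state: State) -> State:
    # Synthesis: combine the evidence into a final response.
    if state.verified:
        state.answer = " | ".join(state.tool_results + state.context)
    else:
        state.answer = "Could not verify an answer."
    return state


def run(query: str, max_verify_rounds: int = 2) -> State:
    state = think_and_plan(State(query=query))
    for _ in range(max_verify_rounds):
        state = verify(retrieve_context(run_plugins(state)))
        if state.verified:
            break
    return synthesize(state)


result = run("compare ROCm and CUDA")
print(result.plan)
print(result.answer)
```

In a LangGraph version, each function would become a node and `State` the shared graph state, with the verification step deciding whether to loop back or continue to synthesis.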

🛠️ Technologies Used

  • LLM: 13B TIGER-OM (Q4_K_M GGUF)
  • RAG Framework: SKT RAG + ADK Kit
  • Agent Framework: LangGraph
  • GPU Stack: ROCm 7.0 + AMD ADK Kit
  • Inference: vLLM (FP16) + llama.cpp (GGUF Q4_K_M)
  • Hardware: AMD MI300X
  • Cloud: AMD Developer Cloud

🚀 Quick Start - GGUF Q4_K_M

# Using llama.cpp
./llama-cli \
  -m tiger-om-q4_k_m.gguf \
  -p "Your complex query here..." \
  -n 1024 \
  -t 8 \
  --temp 0.7

Python Example (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path="tiger-om-q4_k_m.gguf",
    n_gpu_layers=-1,      # Use all GPU layers
    n_ctx=8192,
    verbose=False
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain..."}],
    temperature=0.7,
    max_tokens=1024
)

print(response['choices'][0]['message']['content'])

📁 Repository Structure

  • /skt_ai_labs: Core ADK + RAG integration
  • /plugins: Plugin system
  • /agents: LangGraph workflows
  • /examples: Ready-to-use examples
  • /docs: Architecture and guides

🏆 Hackathon Information

  • Event: AMD Developer Hackathon 2026
  • Trained on: AMD Developer Cloud ($100 credits)
  • Built in Public: Regular technical updates shared
  • Goal: Showcase agentic AI on the AMD ROCm ecosystem

📄 License

MIT
