---
license: other
language:
- en
tags:
- text-generation
- mergekit
- coding
- agentic
- reasoning
- qwen2.5
- llama-3.1
- transformers
- merge
- sovereign
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
- meta-llama/Meta-Llama-3.1-8B-Instruct
pipeline_tag: text-generation
---
<div align="center" style="display: flex; justify-content: center; align-items: center; gap: 40px; flex-wrap: wrap; margin: 2em 0;">
<img src="https://huggingface.co/Vaultkeeper/Sovereign-Code/resolve/main/Sovereign-Code-logo.png" alt="Sovereign-Code" width="400" style="max-height: 400px;" />
<div style="text-align: center;">
<h1 style="margin: 0; font-size: 2.8em; line-height: 1.1;">Sovereign-Code</h1>
<p style="margin: 8px 0 0; font-size: 1.4em; opacity: 0.9; font-weight: 500;">by VaultAI</p>
</div>
<img src="https://huggingface.co/Vaultkeeper/ouroboros-next/resolve/main/vaultai-logo.png" alt="VAULTAI" width="300" style="max-height: 300px;" />
</div>
<div align="center" style="margin: 2em 0; padding: 25px; border: 1px solid #444; border-radius: 12px; background: linear-gradient(145deg, #0a0a0a, #141414); box-shadow: 0 10px 30px rgba(0,0,0,0.7);">
<h2 style="margin: 0; font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; color: #ffffff; letter-spacing: 3px; font-size: 1.6em; text-transform: uppercase;">
Deployment Status: <span style="color: #ff3333; animation: pulse 1.5s infinite;">UNRELEASED</span>
</h2>
<div style="width: 85%; background-color: #1a1a1a; border-radius: 20px; margin: 25px 0; height: 14px; overflow: hidden; border: 1px solid #333; position: relative;">
<div style="width: 100%; height: 100%; background: linear-gradient(90deg, #333, #ffcc00, #333); background-size: 200% 100%; animation: loading-shimmer 2s linear infinite; border-radius: 20px;"></div>
</div>
<p style="margin: 0; font-size: 1.1em; color: #aaa; font-weight: 600; letter-spacing: 1px; text-transform: uppercase;">
<span style="color: #ffcc00;">[ PRE-ALPHA ]</span> SOVEREIGN-CODE & CORPUS-CALLOSUM | ARCHITECTING...
</p>
<style>
@keyframes pulse {
0% { opacity: 1; text-shadow: 0 0 5px #ff3333; }
50% { opacity: 0.3; text-shadow: 0 0 0px #ff3333; }
100% { opacity: 1; text-shadow: 0 0 5px #ff3333; }
}
@keyframes loading-shimmer {
0% { background-position: 200% 0; }
100% { background-position: -200% 0; }
}
</style>
</div>
<br>

### ✅ **Execution, Absolute.**

While most models are built to converse, **Sovereign-Code** is built to execute. It is a specialized model with a single purpose: high-fidelity technical output.

Engineered by VaultAI, Sovereign-Code is a custom **32-layer hybrid** model. It uses a "passthrough" merge to stack the lower layers of **Qwen 2.5 Coder**, chosen for its coding ability, beneath the upper layers of **Llama 3.1**, chosen for its strict instruction following. It does not offer opinions; it delivers functional syntax.

## 🧠 Architecture & Identity: The Logic Terminal

Sovereign-Code is a "Frankenmerge": rather than averaging weights within a single model family, it stacks layers taken from two different ones. Input passes first through the coding-heavy Qwen base and is then filtered through the instruction-following Llama layers at the top of the stack.

**Key Capabilities:**

* **Deterministic Syntax:** Optimized for code generation without conversational filler in Python, C++, Rust, and Mojo.
* **Tattooed Monologue:** A custom Jinja2 chat template hardcodes a mandatory three-phase internal reasoning pass inside `<think>` tags before every output.
* **Hardware Optimized:** Designed for dual-GPU configurations (Polaris/gfx803) running `llama.cpp` with the Vulkan backend.
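
Because every response is expected to open with a `<think>` block, client code will usually want to separate the reasoning from the final answer. The following is a minimal sketch, assuming the model emits at most one well-formed `<think>...</think>` pair; the helper name `split_think` and the three phase labels in the sample string are hypothetical and are not taken from the actual template.

```python
import re

# Matches one <think>...</think> block; DOTALL lets "." span newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output.

    Assumes at most one well-formed <think> block. If none is found,
    the reasoning is empty and the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the block is the user-facing answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hypothetical sample output; the phase names are illustrative only.
raw = "<think>\n1. Parse intent\n2. Plan\n3. Verify\n</think>\nprint('hello')"
reasoning, answer = split_think(raw)
print(answer)
```

Keeping the reasoning as a separate return value lets a caller log it for debugging while passing only the answer to downstream tools.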

### ⚡ Performance & Benchmarks (Estimated)

Sovereign-Code is designed for high throughput on local consumer hardware (RX 570/580 8 GB setups). The figures below are estimates, not measurements.

| Metric | Value | VRAM Footprint | Notes |
| :--- | :--- | :--- | :--- |
| **Quantization** | Q4_K_M (GGUF) | ~9.2 GB | **Full GPU Offload** (exceeds one 8 GB card, so it relies on the dual-GPU setup) |
| **Context Length** | 32,768 Tokens | High Headroom | Optimized for Repo-level Debugging |
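
The ~9.2 GB figure can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch, assuming the ~15B parameter count stated under Model Details and an average of about 4.85 bits per weight for Q4_K_M; that bits-per-weight value is an assumption and varies by model. The function name is hypothetical.

```python
def gguf_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumptions: ~15e9 parameters (from this card) and ~4.85 bits per weight
# for Q4_K_M (an assumed average, not a measured value for this model).
size_gb = gguf_weight_size_gb(15e9, 4.85)
print(f"{size_gb:.1f} GB")

# Weights only; the KV cache and compute buffers come on top of this.
fits_one_card = size_gb <= 8.0    # a single 8 GB RX 570/580
fits_two_cards = size_gb <= 16.0  # two 8 GB cards combined
```

Under these assumptions the weights alone do not fit on one 8 GB card, which matches the dual-GPU target described above.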

### Standardized Accuracy Benchmarks

*Benchmarks are currently queued for evaluation.*

| Benchmark | Focus Area | Score | Status |
| :--- | :--- | :--- | :--- |
| **HumanEval** | Coding & Logic | *TBD* | ⏳ Pending Eval |
| **MBPP** | Python Programming | *TBD* | ⏳ Pending Eval |
| **GSM8K** | Mathematical Reasoning | *TBD* | ⏳ Pending Eval |

## Model Details

- **Type**: Causal Language Model (Hybrid Passthrough)
- **Base Architecture**: Qwen 2.5 (7B) + Llama 3.1 (8B)
- **Total Parameters**: ~15B (Effective density via Layer Stacking)
- **Merge Method**: Passthrough / Frankenmerge
- **Weights Composition**:
  - **Base (Layers 0-16)**: [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
  - **Cortex (Layers 16-32)**: [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
- **License**: Other (see the base model licenses)
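
The layer ranges above read most naturally as half-open intervals, which is how mergekit interprets `layer_range`: 0-16 and 16-32 then contribute 16 layers each, for the 32 layers stated earlier. The sketch below is a hypothetical reconstruction of that layout as a Python dictionary using mergekit's passthrough keys (`merge_method`, `slices`, `sources`, `layer_range`). The actual merge configuration is not published in this card, and the `total_layers` helper is illustrative.

```python
# Hypothetical reconstruction; the real merge config is not in this card.
# Keys mirror mergekit's passthrough schema.
merge_config = {
    "merge_method": "passthrough",
    "slices": [
        {"sources": [{"model": "Qwen/Qwen2.5-Coder-7B-Instruct",
                      "layer_range": [0, 16]}]},
        {"sources": [{"model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
                      "layer_range": [16, 32]}]},
    ],
}


def total_layers(config: dict) -> int:
    """Count stacked layers, treating each layer_range as [start, end)."""
    total = 0
    for piece in config["slices"]:
        start, end = piece["sources"][0]["layer_range"]
        if not 0 <= start < end:
            raise ValueError(f"invalid layer_range: {[start, end]}")
        total += end - start
    return total


print(total_layers(merge_config))  # 32
```

Each `layer_range` indexes into its own source model, so the second slice refers to the top 16 layers of Llama 3.1 8B rather than to positions in the merged stack.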

## Why Sovereign-Code?

- **The Execution Engine:** No conversational "As an AI..." filler.
- **Analytical Grounding:** The built-in `<think>` protocol makes the model reason through its code conceptually before it writes a single line.
- **Agentic Ready:** Optimized for tool calling and autonomous development workflows.

<div align="center" style="display: flex; justify-content: center; align-items: center; gap: 40px; flex-wrap: wrap; margin: 2em 0;">
<img src="https://huggingface.co/Vaultkeeper/ouroboros-next/resolve/main/vaultai-logo.png" alt="VAULTAI" width="100" style="max-height: 100px;" />
</div>