RtaForge / Anvaya-Rabbit-2.7B

Architecture: RtaSSM (Custom non-transformer State Space Model)
Parameters: 2.7 Billion
Topology: fu-64 (64 layers, d_model=2560)

Current Release: v0.6 — Logic Domain

This release reflects training on a curated logic and reasoning corpus. The model is an intermediate checkpoint; full benchmark evaluation and public release are gated on completing the training curriculum.

Files:

Format	Path
PyTorch	`base/Anvaya-Rabbit-2.7B-0.6-base.pt`

⚠️ v0.1-alpha Weights Deprecated

Do not use the v0.1-alpha weights. The v0.6 checkpoint is the correct starting point.

Two silent corruptions occurred during weight migration in v0.1-alpha:

Tensor Layout Mismatch: Migration assumed a WIDE layout for in_proj, but the Mamba2-2.7B HF checkpoint stores it TALL, [10576, 2560]. As a result, gating values were silently loaded into the signal matrices.
Vocabulary Alignment: A shape mismatch caused the embedding layer to be skipped during loading, so the model ran on randomly initialized embeddings.
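
Both failures above are shape problems that a strict pre-load check would surface instead of letting them pass silently. The following is a minimal sketch, not the actual migration code (which is not shown in this card). The parameter names `in_proj.weight` and `embedding.weight` and the helpers `check_shapes` and `assert_loadable` are hypothetical; the shapes come from the values stated here (TALL `[10576, 2560]`, vocab 50,280, d_model 2560).

```python
def check_shapes(expected, actual):
    """Compare expected vs. actual tensor shapes; return a list of problems."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            # A skipped tensor is an error, never a silent default.
            problems.append(f"{name}: missing from checkpoint")
        elif tuple(got) == tuple(want):
            continue
        elif len(want) == 2 and tuple(got) == tuple(reversed(want)):
            # Same dimensions in the opposite order: a layout mix-up.
            problems.append(
                f"{name}: transposed layout, got {tuple(got)}, want {tuple(want)}"
            )
        else:
            problems.append(
                f"{name}: shape mismatch, got {tuple(got)}, want {tuple(want)}"
            )
    return problems


# Hypothetical parameter names; shapes taken from this card.
EXPECTED = {
    "in_proj.weight": (10576, 2560),    # TALL layout
    "embedding.weight": (50280, 2560),  # vocab x d_model
}


def assert_loadable(actual):
    """Raise before loading if any expected tensor is missing or misshapen."""
    problems = check_shapes(EXPECTED, actual)
    if problems:
        raise ValueError("refusing to load:\n" + "\n".join(problems))
```

With a real checkpoint, `actual` could be built from a state dict as `{name: tuple(t.shape) for name, t in state_dict.items()}`. The point of the design is that a mismatch raises an error rather than falling back to random initialization.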

Result: cross-entropy (CE) of ~46, roughly 4× the uniform-random baseline of 10.83 (ln 50,280). Fixed in v0.5; v0.6 builds on that foundation.
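
The random baseline can be reproduced from the vocabulary size alone: a model that guesses uniformly over V tokens has a cross-entropy of ln V. A short check, assuming the reported CE is measured in nats (which is consistent with the 10.83 figure):

```python
import math

VOCAB_SIZE = 50_280                # tokenizer vocab stated in this card
baseline = math.log(VOCAB_SIZE)    # CE of a uniform guess, in nats
observed = 46.0                    # approximate CE reported for v0.1-alpha
ratio = observed / baseline

print(f"uniform baseline: {baseline:.2f} nats")   # uniform baseline: 10.83 nats
print(f"observed / baseline: {ratio:.1f}x")       # observed / baseline: 4.2x
```

A CE well above ln V means the model was confidently wrong, not merely uninformed, which fits weights loaded into the wrong tensors.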

Model Highlights

Architecture: RtaSSM — a custom non-transformer State Space Model, not derived from any public SSM codebase
Scale: 2.7 Billion Parameters
Training: Multi-phase curriculum (logic → math → science/STEM → instruct fine-tune)
Governance: Constitutional training gate — every weight update validated before commit
Tokenizer: EleutherAI/gpt-neox-20b (vocab 50,280)
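
The governance item describes a validate-then-commit pattern. This card does not say what the gate actually checks, so the sketch below only illustrates the control flow. The function names and both validators (a finiteness check and a step-size bound) are assumptions made for illustration, not RtaForge's criteria.

```python
import math


def gated_update(weights, delta, validators):
    """Apply delta only if every validator accepts the candidate weights."""
    candidate = [w + d for w, d in zip(weights, delta)]
    for check in validators:
        if not check(weights, candidate):
            return weights, False   # reject: keep the committed weights
    return candidate, True          # commit the validated update


def all_finite(_old, new):
    """Hypothetical validator: reject NaN or infinite weights."""
    return all(math.isfinite(x) for x in new)


def bounded_step(max_norm):
    """Hypothetical validator: reject updates whose L2 step exceeds max_norm."""
    def check(old, new):
        step = math.sqrt(sum((n - o) ** 2 for o, n in zip(old, new)))
        return step <= max_norm
    return check


weights = [0.5, -1.0]
gate = [all_finite, bounded_step(1.0)]
weights, committed = gated_update(weights, [0.1, 0.1], gate)
```

A rejected update returns the previous weights unchanged, so a bad step never reaches the committed state.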

Roadmap

Version	Domain	Status
v0.6	Logic & reasoning	Released
v0.7	Math	In progress
v0.8	Science / STEM	Planned
v1.0	Full curriculum + instruct SFT + benchmarks	Planned

Anvaya: Sovereignty as a Service
RtaForge OPC Private Limited · ANVAYA Sovereign AI Research
