RtaForge / Anvaya-Rabbit-2.7B

Architecture: RtaSSM (Custom non-transformer State Space Model)
Parameters: 2.7 Billion
Topology: fu-64 (64 layers, d_model=2560)


Current Release: v0.6 — Logic Domain

This release reflects training on a curated logic and reasoning corpus. The model is an intermediate checkpoint; full benchmark evaluation and public release are gated on completing the training curriculum.

Files:

Format    Path
PyTorch   base/Anvaya-Rabbit-2.7B-0.6-base.pt

⚠️ v0.1-alpha Weights Deprecated

Do not use the v0.1-alpha weights. The v0.6 checkpoint is the correct starting point.

A silent corruption occurred during weight migration in v0.1-alpha:

  1. Tensor Layout Mismatch: The migration assumed a WIDE layout for in_proj, but the Mamba2-2.7B HF checkpoint stores it TALL, [10576, 2560]. As a result, gating values were silently loaded into the signal matrices.
  2. Vocabulary Alignment: A shape mismatch caused the embedding layer to be skipped, so the model ran on randomly initialized embeddings.
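
Both failures above went unnoticed because the migration tolerated shape mismatches instead of stopping. A minimal sketch of a stricter check follows, written in pure Python over shape tuples. The function names, the tensor key names, and the Mamba2 hyperparameters (d_inner=5120, d_state=128, n_groups=1, head_dim=64) are assumptions for illustration and are not taken from the RtaForge code; the hyperparameters were chosen because they reproduce the 10576 row count quoted above.

```python
# Hypothetical sketch: abort a weight migration on any layout or shape mismatch.

def in_proj_rows(d_inner, d_state, n_groups, n_heads):
    # Assumed Mamba2-style fused projection: gate z, signal x, B, C, and per-head dt.
    return 2 * d_inner + 2 * n_groups * d_state + n_heads

def check_migration(src_shapes, dst_shapes):
    """Raise ValueError if a target tensor is missing, transposed, or mis-shaped."""
    problems = []
    for name, want in dst_shapes.items():
        got = src_shapes.get(name)
        if got is None:
            problems.append(f"{name}: missing in source")
        elif got == want:
            continue
        elif got == want[::-1]:
            problems.append(f"{name}: transposed layout {got}, expected {want}")
        else:
            problems.append(f"{name}: shape {got}, expected {want}")
    if problems:
        raise ValueError("migration aborted:\n" + "\n".join(problems))

D_MODEL, D_INNER, D_STATE, N_GROUPS, HEAD_DIM = 2560, 5120, 128, 1, 64
N_HEADS = D_INNER // HEAD_DIM
ROWS = in_proj_rows(D_INNER, D_STATE, N_GROUPS, N_HEADS)

# Source checkpoint is TALL; the (hypothetical) target expects WIDE.
source = {"in_proj.weight": (ROWS, D_MODEL), "embedding.weight": (50280, D_MODEL)}
target_wide = {"in_proj.weight": (D_MODEL, ROWS), "embedding.weight": (50280, D_MODEL)}

try:
    check_migration(source, target_wide)
    error = None
except ValueError as exc:
    error = str(exc)  # reports in_proj.weight as transposed
```

The point of the design is that a mismatch raises rather than being skipped, so neither a transposed projection nor an unloaded embedding can pass silently.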

Result: cross-entropy (CE) loss of about 46, roughly 4× the uniform-random baseline of ln(50,280) ≈ 10.83. Both issues were fixed in v0.5; v0.6 builds on that foundation.
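
The 10.83 baseline is what a model scores when it spreads probability evenly over the vocabulary. A short sketch of that arithmetic, assuming the loss is measured in nats and using the vocabulary size of 50,280 listed for the tokenizer:

```python
import math

VOCAB_SIZE = 50280  # tokenizer vocabulary size stated in this card

# A model assigning equal probability to every token has CE = ln(V).
uniform_ce = math.log(VOCAB_SIZE)
observed_ce = 46.0  # approximate v0.1-alpha value reported in this card

ratio = observed_ce / uniform_ce
print(f"uniform baseline: {uniform_ce:.2f} nats")
print(f"observed / baseline: {ratio:.1f}x")
```

A loss well above ln(V) means the model is confidently wrong rather than merely uninformed, which is consistent with weights being loaded into the wrong tensors.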


Model Highlights

  • Architecture: RtaSSM — a custom non-transformer State Space Model, not derived from any public SSM codebase
  • Scale: 2.7 Billion Parameters
  • Training: Multi-phase curriculum (logic → math → science/STEM → instruct fine-tune)
  • Governance: Constitutional training gate — every weight update validated before commit
  • Tokenizer: EleutherAI/gpt-neox-20b (vocab 50,280)
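
The card does not describe how the training gate works. As a purely illustrative sketch of the general validate-before-commit pattern, the following applies a candidate update to a copy of the weights and commits it only if a check passes. Everything here (gated_update, the validate callback, the finiteness and step-size rule) is hypothetical and is not the RtaForge gate.

```python
import math

def gated_update(weights, grads, lr, validate):
    """Return (weights, committed); weights change only if validate() passes."""
    candidate = [w - lr * g for w, g in zip(weights, grads)]
    if validate(weights, candidate):
        return candidate, True
    return weights, False  # rejected: previous weights are kept untouched

def finite_and_bounded(old, new, max_step=1.0):
    # Hypothetical rule: reject NaN/inf values and steps larger than max_step.
    if not all(math.isfinite(v) for v in new):
        return False
    step = math.sqrt(sum((n - o) ** 2 for n, o in zip(new, old)))
    return step <= max_step

weights = [0.5, -0.2, 0.1]
# A small, finite step is committed.
weights, ok_good = gated_update(weights, [0.1, 0.0, -0.1], 0.1, finite_and_bounded)
# A NaN gradient is rejected and the weights stay as they were.
weights, ok_bad = gated_update(weights, [float("nan"), 0.0, 0.0], 0.1, finite_and_bounded)
```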

Roadmap

Version   Domain                                        Status
v0.6      Logic & reasoning                             Released
v0.7      Math                                          In progress
v0.8      Science / STEM                                Planned
v1.0      Full curriculum + instruct SFT + benchmarks   Planned

Anvaya: Sovereignty as a Service
RtaForge OPC Private Limited · ANVAYA Sovereign AI Research
