RtaForge / Anvaya-Rabbit-2.7B
Architecture: RtaSSM (Custom non-transformer State Space Model)
Parameters: 2.7 Billion
Topology: fu-64 (64 layers, d_model=2560)
Current Release: v0.6 — Logic Domain
This release reflects training on a curated logic and reasoning corpus. The model is an intermediate checkpoint; full benchmark evaluation and public release are gated on completing the training curriculum.
Files:
| Format | Path |
|---|---|
| PyTorch | base/Anvaya-Rabbit-2.7B-0.6-base.pt |
⚠️ v0.1-alpha Weights Deprecated
Do not use the v0.1-alpha weights. The v0.6 checkpoint is the correct starting point.
A silent corruption occurred during weight migration in v0.1-alpha:
- Tensor Layout Mismatch: Migration assumed WIDE layout for `in_proj`. The Mamba2-2.7B HF checkpoint uses TALL [10576, 2560], silently loading gating values into signal matrices.
- Vocabulary Alignment: Shape mismatch caused the embedding layer to be skipped. The model ran on random embeddings.
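Result: cross-entropy (CE) of ~46, roughly 4× worse than the random baseline of 10.83, which is ln(50,280), the loss of a uniform guess over the vocabulary. Fixed in v0.5; v0.6 builds on that foundation.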
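Both failures described above are shape problems that stay silent unless the loader checks for them. The following is a minimal sketch of two cheap guards, not the migration code used for this model: a strict shape comparison before any weights are copied, and a loss sanity check against the uniform baseline ln(V) afterwards. The tensor names, the target shapes, and the 5% tolerance are assumptions made for illustration; only the `in_proj` shape [10576, 2560] and the vocabulary size 50,280 come from this card.

```python
import math


def find_shape_mismatches(source_shapes, target_shapes):
    """Compare two {name: shape} mappings and return a list of problems.

    An empty list means every target tensor has a source tensor of the
    same shape. Nothing is skipped silently: missing and mismatched
    tensors are both reported.
    """
    problems = []
    for name, expected in target_shapes.items():
        actual = source_shapes.get(name)
        if actual is None:
            problems.append(f"{name}: missing from source checkpoint")
            continue
        actual, expected = tuple(actual), tuple(expected)
        if actual != expected:
            # A reversed shape usually means a wide/tall layout mix-up.
            transposed = actual == tuple(reversed(expected))
            hint = " (transposed layout?)" if transposed else ""
            problems.append(f"{name}: source {actual} != target {expected}{hint}")
    return problems


def uniform_baseline_ce(vocab_size):
    """Cross-entropy (in nats) of a uniform guess over the vocabulary."""
    return math.log(vocab_size)


def loss_is_suspicious(ce, vocab_size, tolerance=1.05):
    """Flag a loss that is worse than random guessing by more than `tolerance`."""
    return ce > tolerance * uniform_baseline_ce(vocab_size)


# Hypothetical tensor names and target shapes, for illustration only.
source = {
    "layers.0.in_proj.weight": (10576, 2560),  # TALL layout, as in this card
    "embedding.weight": (50280, 2560),
}
target = {
    "layers.0.in_proj.weight": (2560, 10576),  # loader wrongly assumes WIDE
    "embedding.weight": (50288, 2560),         # made-up mismatched vocab size
}

problems = find_shape_mismatches(source, target)
for line in problems:
    print(line)

print(round(uniform_baseline_ce(50280), 2))  # 10.83
print(loss_is_suspicious(46.0, 50280))       # True
```

Failing loudly on any reported problem, instead of skipping the affected tensor, is the design choice that matters here: a skipped embedding layer leaves the model running on random embeddings with no error raised.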
Model Highlights
- Architecture: RtaSSM — a custom non-transformer State Space Model, not derived from any public SSM codebase
- Scale: 2.7 Billion Parameters
- Training: Multi-phase curriculum (logic → math → science/STEM → instruct fine-tune)
- Governance: Constitutional training gate — every weight update validated before commit
- Tokenizer: EleutherAI/gpt-neox-20b (vocab 50,280)
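The governance bullet above describes a gate that validates each weight update before it is committed. This card does not specify which checks the gate applies, so the sketch below only illustrates the general validate-then-commit pattern on plain Python lists. The function names, the finiteness check, and the step-norm limit are all hypothetical and are not taken from RtaForge's implementation.

```python
import math


def validate_update(old, new, max_step_norm=1.0):
    """Hypothetical gate: accept only finite weights and a bounded step size."""
    if len(old) != len(new):
        return False
    if not all(math.isfinite(w) for w in new):
        return False
    step_norm = math.sqrt(sum((n - o) ** 2 for o, n in zip(old, new)))
    return step_norm <= max_step_norm


def gated_step(weights, grads, lr=0.1, max_step_norm=1.0):
    """Compute a candidate SGD update and commit it only if the gate passes.

    Returns (weights, committed). On rejection the original weights are
    returned unchanged, so a bad update never reaches the model.
    """
    candidate = [w - lr * g for w, g in zip(weights, grads)]
    if validate_update(weights, candidate, max_step_norm):
        return candidate, True
    return list(weights), False


weights = [1.0, 2.0]

# A small, finite update passes the gate.
new_weights, committed = gated_step(weights, [0.5, -0.5])

# A NaN gradient or an oversized step is rejected and the weights are kept.
kept_nan, committed_nan = gated_step(weights, [float("nan"), 0.0])
kept_big, committed_big = gated_step(weights, [100.0, 0.0])
```

The candidate update is computed on a copy, so rejecting it needs no rollback logic: the committed weights are simply never overwritten.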
Roadmap
| Version | Domain | Status |
|---|---|---|
| v0.6 | Logic & reasoning | Released |
| v0.7 | Math | In progress |
| v0.8 | Science / STEM | Planned |
| v1.0 | Full curriculum + instruct SFT + benchmarks | Planned |
Anvaya: Sovereignty as a Service
RtaForge OPC Private Limited · ANVAYA Sovereign AI Research