# Parcae-xlarge-1.3B
Parcae is a stable looped architecture for language modeling. Unlike traditional fixed-depth architectures that scale by increasing parameter count, Parcae increases FLOPs by passing activations through a block of layers in a loop. It addresses the instability of prior looped models by recasting looping as a nonlinear time-variant dynamical system and constraining the spectral norm of the injection parameters.
- Paper: Parcae: Scaling Laws For Stable Looped Language Models
- Project Page: https://sandyresearch.github.io/parcae/
- Repository: https://github.com/sandyresearch/parcae
## Installation

To use this model, install the `parcae-lm` package:

```bash
pip install parcae-lm
```
## Usage

You can load the pretrained weights with the `parcae_lm` library:

```python
import parcae_lm

# Load this pretrained model from Hugging Face
model = parcae_lm.from_pretrained("SandyResearch/parcae-xlarge-1_3b")
```
## Model Details
This specific checkpoint is the 1.3B parameter variant of Parcae, trained on the FineWeb-Edu dataset.
| Model | Parameters | Prelude | Core | Coda | Model dim. | Recurrence |
|---|---|---|---|---|---|---|
| Parcae-1.3B | 1.3B | 8 | 8 | 8 | 1536 | 8 |
Note: These are base models without any form of downstream modification (instruction tuning, etc.).
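To make the table concrete, here is a minimal sketch of what a looped forward pass looks like: a prelude block runs once, a core block is applied `Recurrence` times with the prelude output injected at each step, and a coda block produces the final activations. This is an illustrative toy (scalar stand-ins for the layer stacks, and an assumed injection scheme), not the official Parcae implementation.

```python
def looped_forward(x, prelude, core, coda, recurrence=8):
    """Run activations through a prelude, loop a core block, then a coda.

    prelude/core/coda are callables standing in for stacks of layers;
    the core receives both the loop state and the injected prelude output.
    """
    e = prelude(x)       # prelude output, injected into every loop step
    s = e                # initialise the loop state
    for _ in range(recurrence):
        s = core(s, e)   # core sees current state plus injected input
    return coda(s)

# Toy scalar stand-ins for the layer stacks. The 0.5 factor mimics the
# spectral-norm constraint: the loop map is contractive, so repeated
# application converges instead of diverging.
out = looped_forward(
    1.0,
    prelude=lambda x: x + 1,
    core=lambda s, e: 0.5 * s + e,
    coda=lambda s: s,
    recurrence=8,
)
```

With these stand-ins the state converges geometrically toward a fixed point, which is the stability behaviour the spectral-norm constraint is meant to guarantee; an unconstrained core (factor > 1) would blow up as recurrence grows.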
## Citation

```bibtex
@misc{prairie2026parcaescalinglawsstable,
  title={Parcae: Scaling Laws For Stable Looped Language Models},
  author={Hayden Prairie and Zachary Novack and Taylor Berg-Kirkpatrick and Daniel Y. Fu},
  year={2026},
  eprint={2604.12946},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2604.12946},
}
```
## References

This codebase was built on karpathy/nanochat, seal-rg/recurrent-pretraining, and Lightning-AI/litgpt.