Reinforcement Learning
Transformers
English
post-training
distillation
agentic-coding
composer-2.5
cursor
kimi-k2
grpo
dapo
diloco
openenv
trl
verl
research
methodology
Instructions to use Codeseys/composer-replication-framework with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Codeseys/composer-replication-framework with Transformers:

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("Codeseys/composer-replication-framework", dtype="auto")

- Notebooks
- Google Colab
- Kaggle
Commit History
Wave 16: install ergonomics + gradient evidence + SDPO end-to-end example c0a5ab7
Wave 15: 4-angle multi-model self-critique caught 2 math BLOCKERs in primary loss kernels; fixed against upstream byte-for-byte + GSM8K example + ergonomics e5add15
Wave 13: serverless DiLoCo + replaysim normalization + 3 distillation losses + PRIME-RL + Monarch b266c31
Wave 12: close V1-V8 brief — GPU smoke, SDPO firing, real-trace e2e d88715c
Wave 11: cross-model adversarial review + honest down-revision f16fa23
Wave 10 — packaging: composer_replication is now pip-installable ac05fbf
Wave 6: vision validation self-audit (5/10 to 9/10 in 5 days, no GPU) 040eff8
baladithyab commited on
Wave 5: full publication-materials drafts (pre-experimental release set) 639a760
Wave 4: data collator + loss composition smoke (38/38 tests pass) 157cdba
Wave 3: integration architecture + spike-005 trainer skeleton (16 tests pass) fd77f74
Integrate Cursor blog directly + audit research note + add SDPO/OPSD link 1cede23