# composer_replication

The Composer 2.5 Replication Framework, packaged for `pip install`.

This package re-exports the verified APIs that live in the
[`spikes/`](../spikes/) directory of the parent repository, so that downstream
code can `import composer_replication` instead of poking at `sys.path`.

## Package map

| module | source spike | purpose |
|---|---|---|
| `composer_replication.loss` | spike 006 | `compose_loss(model, batch, ...)`, a free function that composes the 3-channel loss, plus the `LossComponents` dataclass |
| `composer_replication.batch` | spike 006 | `build_batch(tokenizer)` — real chat-template batch from any HF tokenizer |
| `composer_replication.opsd` | spike 005 | `generalized_jsd_loss` (verified port of `siyan-zhao/OPSD`) |
| `composer_replication.teacher_replay` | spike 001/005 | `replay_trace`, `extract_dpo_pairs`, `TraceState`, `TeacherSpec` (multi-teacher OpenRouter replay) |
| `composer_replication.hint_generator` | spike 005 | Hint-text construction at error sites for SDPO channel |
| `composer_replication.trainer` | spike 005 | `ComposerReplicationTrainer` (TRL `GRPOTrainer` subclass with the 3 channels) |
| `composer_replication.ingestion` | spike 007 | `ClaudeCodeIngester` (Claude Code session JSONL → `TraceState`) |
| `composer_replication.diloco` | spike 008 | `make_diloco_outer_loop` (wraps `torchft.local_sgd.DiLoCo`) |
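
To see which of the modules above import cleanly in a given environment, something like the following sketch may help. It relies only on the module names listed in the table; the helper name `check_modules` is hypothetical and not part of the package. Modules that wrap third-party libraries (for example `trainer` with TRL or `diloco` with `torchft`) may fail to import if those libraries are missing, which is an assumption about how the optional dependencies behave.

```python
import importlib

# Module names copied from the package map above.
PACKAGE_MAP_MODULES = (
    "composer_replication.loss",
    "composer_replication.batch",
    "composer_replication.opsd",
    "composer_replication.teacher_replay",
    "composer_replication.hint_generator",
    "composer_replication.trainer",
    "composer_replication.ingestion",
    "composer_replication.diloco",
)


def check_modules(names=PACKAGE_MAP_MODULES):
    """Return {module_name: True/False} depending on whether the import succeeds.

    Hypothetical helper for illustration; it is not shipped with the package.
    """
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # Covers a missing package as well as missing optional dependencies.
            status[name] = False
        else:
            status[name] = True
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        print(f"{'ok     ' if ok else 'MISSING'} {name}")
```

Running this after `pip install -e .` gives a quick view of which parts of the framework are usable before starting a training script.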

## Why a package on top of spikes?

The spikes are research artifacts: each one has its own `README.md`, tests,
verdict, and a `sys.path` hack to find sibling modules. They are kept
permanently as verification harnesses.

Most users want to `pip install -e . && python my_training_script.py`. This
package is the pip-installable face of the framework. The two surfaces stay
in sync because the package modules are 1:1 copies of the spike modules with
only the import paths changed (sibling-relative → package-absolute).
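
The import-path change described above can be illustrated with a small sketch. It assumes the spike modules import their siblings with the `from <module> import ...` form, which this README does not confirm; the function `to_package_absolute` is hypothetical and only shows the shape of the rewrite, not the tooling the repository actually uses.

```python
import re

PACKAGE = "composer_replication"

# Sibling module names taken from the package map.
SIBLING_MODULES = (
    "loss", "batch", "opsd", "teacher_replay",
    "hint_generator", "trainer", "ingestion", "diloco",
)

# Matches "from <sibling>" at the start of a line (assumed spike import style).
_SIBLING_IMPORT = re.compile(
    r"^(?P<indent>[ \t]*)from[ \t]+(?P<mod>" + "|".join(SIBLING_MODULES) + r")\b",
    flags=re.MULTILINE,
)


def to_package_absolute(source: str) -> str:
    """Rewrite sibling-relative imports to package-absolute ones.

    Hypothetical helper: only handles the `from <module> import ...` form.
    """
    return _SIBLING_IMPORT.sub(
        lambda m: f"{m['indent']}from {PACKAGE}.{m['mod']}", source
    )


spike_source = "import torch\nfrom opsd import generalized_jsd_loss\n"
print(to_package_absolute(spike_source))
```

A check along these lines could compare the rewritten spike source against the package module to confirm the two surfaces have not drifted apart.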

## Quickstart

See [`examples/qwen_05b_quickstart/`](../examples/qwen_05b_quickstart/) at
the repo root.