Qwable-5-27B-Coder
Update (2026-06-22): This card has been rewritten to tell the full story. The original release was deliberately under-documented as part of a point about hype versus evidence in local AI. The recipe below is the whole truth, and it always was, but the original framing did most of the work. That was the experiment. This is the debrief.
What this actually is
A Qwen3.6-27B base, lightly post-trained on 10 traces total:
- 5 high-quality traces taken from the top of a Fable 5 dataset
- 5 traces generated by Kimi 2.7 Coder
Trained on a single DGX Spark (GB10) setup in roughly 3 minutes.
That is the entire recipe. No large curated corpus, no multi-stage pipeline, no RL. Ten examples and three minutes, then shipped with a polished card, a generated banner, and a confident announcement.
Why this exists
It was easy to make this look like a "distilled agentic coder" worth downloading. The model card was clean, the teacher names (Fable, Kimi) carried weight, the name had a version number in it, and the framing implied far more work than 10 traces and 3 minutes. None of the individual claims were false. The impression they created was the thing under test.
As local AI gets more popular, the failure mode is obvious: minimal work, aggressive marketing, downloads and attention flowing toward whatever is framed best rather than whatever is measured best. The ecosystem currently rewards hype over rigor, and that is a problem the community has to solve from the inside.
So this is a working demonstration of how little it takes to manufacture credibility, released so the reveal can make the point more concretely than an argument would.
What you should actually do (with any model, not just this one)
- Test it yourself. Do not infer capability from teacher names or a nice card.
- Demand real evals. Look at the data volume and the methodology, not just which models the traces came from.
- Be suspicious of buzzword names and benchmark-maxxing. A version number and a strong-sounding name are not evidence.
- Prefer hardware-specific, reproducible, open evals over "just trust me, it's distilled from {impressive model}."
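To make "look at the data volume" concrete, here is a minimal sketch of how wide the uncertainty on a pass rate is at small evaluation sizes. It uses the standard Wilson score interval. The pass counts are invented for illustration and are not measurements of this model or its base, and `wilson_interval` and `intervals_overlap` are hypothetical helper names, not part of any release.

```python
import math


def wilson_interval(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a pass rate; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = passes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented example: a base model passes 6 of 10 tasks, a fine-tune passes 8 of 10.
base_small = wilson_interval(6, 10)
tuned_small = wilson_interval(8, 10)
print(base_small, tuned_small, intervals_overlap(base_small, tuned_small))

# Same pass rates on 1000 tasks (also invented): the intervals separate.
base_large = wilson_interval(600, 1000)
tuned_large = wilson_interval(800, 1000)
print(base_large, tuned_large, intervals_overlap(base_large, tuned_large))
```

Overlapping intervals are a rough heuristic rather than a formal significance test, but the point stands: on ten tasks, "80% versus 60%" is compatible with no real difference, so a headline number without its sample size tells you very little.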
Open source can genuinely help take knowledge and privacy back. That only holds if the community is discerning. Empty promises compound, and we will all pay for them if we are not measured in our reactions.
Intended use
Educational and illustrative. Use this as a reference point for what a 10-trace, 3-minute post-train does and does not buy you, and as a prompt to build or demand better evaluation before trusting any release like it.
It is not recommended as a production coding model. It has not been evaluated against any held-out coding benchmark with a methodology I would stand behind, and with only 10 training traces you should assume the behavioral change over the base is narrow and underdetermined.
Training details
| Field | Value |
|---|---|
| Base model | Qwen3.6-27B |
| Method | Full fine-tune |
| Data | 10 traces (5 Fable 5 seed traces + 5 Kimi 2.7 Coder generations) |
| Hardware | DGX Spark (GB10, 128 GB unified memory) |
| Wall-clock | ~3 minutes |
Limitations
- Behavioral change over the base is driven entirely by 10 examples and is statistically underdetermined relative to the base model's existing capability.
- No contamination-checked benchmark numbers are provided, by design, because providing impressive-looking numbers without methodology is part of what this release is criticizing.
- Any apparent strength on inputs resembling the seed traces should not be generalized.
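For readers who want to see what a basic contamination check can look like, the sketch below flags evaluation items that share a long word n-gram with any training text. This is one common, simple approach and not a method used for this release. The whitespace tokenization, the default `n=8`, and the function names are all assumptions chosen for illustration, and the example strings are made up.

```python
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Lowercased, whitespace-tokenized word n-grams; empty if the text is shorter than n."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(train_texts: list[str], eval_texts: list[str], n: int = 8) -> float:
    """Fraction of eval items sharing at least one n-gram with any training text."""
    if not eval_texts:
        return 0.0
    train_grams: set[tuple[str, ...]] = set()
    for text in train_texts:
        train_grams |= ngrams(text, n)
    flagged = sum(1 for text in eval_texts if ngrams(text, n) & train_grams)
    return flagged / len(eval_texts)


# Made-up strings: the first eval item reuses the opening of the training trace.
train = ["Write a function that reverses a linked list in place and returns the new head"]
evals = [
    "write a function that reverses a linked list in place without recursion",
    "Implement binary search over a sorted array of integers and return the index",
]
print(contamination_rate(train, evals))
```

Exact n-gram overlap misses paraphrased or translated leakage, so a low rate from a check like this is weak evidence of a clean benchmark, while a high rate is a strong reason to distrust the reported numbers.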
Attribution
- Base model: Qwen3.6-27B (see base model card for its license and terms)
- Seed data: Fable 5 dataset, Kimi 2.7 Coder generations
Citation / context
This release is part of an ongoing argument for demand-first, methodology-first evaluation in open local AI. If you want the evaluation side of the work rather than the cautionary side, that is where to look next.