meta-qwen-4b: Doubter wrapper for Qwen3.5-4B
A trained meta-attention "Doubter" wrapper for Qwen/Qwen3.5-4B. It is not a full model;
it is a thin wrapper (~2% of the base) that reads the frozen base's own activations and injects
cognitive tokens through gated cross-attention, so the model learns when to trust itself:
answer confidently or refuse honestly. The base weights are never modified.
See the meta-spider framework for how to load and run it.
What's in here
| File | What it is |
|---|---|
| doubter_checkpoint.pt | the trained wrapper weights (encoder + cross-attention + gates), ~112 MB |
| doubter_sidecar.gguf | the same wrapper exported for llama.cpp (CPU / Metal / edge), ~168 MB |
| run.json | the training manifest (base model, layers, encoder type, quantization, dataset) |
Results (honest metrics)
Evaluated on MMLU (held-out, n=50), MCQ-direct. The base answers everything; the Doubter abstains on questions it would likely get wrong.
| Metric | Base | + Doubter |
|---|---|---|
| Selective accuracy (of answered, % correct) | 0.72 | 1.00 |
| Coverage (answered / total) | 100% (50/50) | 24% (12/50) |
| Refusal rate | 0% | 76% |
| Refusal precision (vs oracle*) | n/a | 0.37 |
| Over-refusal rate | n/a | 0.63 |
*Refusal precision is scored against an oracle (would the base have been wrong if it answered?), not a naive text match. It caught all 14 questions the base got wrong (smart-refusal 1.0).
How to read this. On the 12 questions it chose to answer, it was 100% correct (versus the base's 72% on all 50). The cost is heavy over-refusal: 76% of questions were refused, and on ~63% of those refusals the base actually knew the answer. The usefulness criterion here is selective accuracy (+28 pp), not the refusal rate. The test set is small (n=50), and McNemar on raw correctness is not significant (p≈0.18); the value is the calibration of what it answers. This checkpoint is primarily an infrastructure result: the full collect→train→eval cycle ran locally on a 4 GB laptop GPU (RTX 3050) via nf4.
Over-refusal is a known cost, not a failure. See the framework's honesty notes on metrics.
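The numbers in the table can be reproduced from per-question flags. The following is a minimal sketch, not the framework's evaluation code: the `Record` type and `selective_metrics` function are hypothetical names, the records are reconstructed from the aggregate counts above (12 answered, 14 justified refusals, 24 over-refusals) rather than taken from the real per-question log, and it assumes that when the Doubter answers, its answer is scored the same as the base's.

```python
from dataclasses import dataclass


@dataclass
class Record:
    base_correct: bool  # oracle flag: would the frozen base have answered correctly?
    refused: bool       # did the Doubter abstain on this question?


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def selective_metrics(records):
    """Compute the model-card metrics from per-question oracle/refusal flags."""
    answered = [r for r in records if not r.refused]
    refused = [r for r in records if r.refused]
    base_wrong = [r for r in records if not r.base_correct]
    # A refusal is "justified" when the base would have been wrong.
    justified = [r for r in refused if not r.base_correct]
    return {
        "selective_accuracy": ratio(sum(r.base_correct for r in answered), len(answered)),
        "coverage": ratio(len(answered), len(records)),
        "refusal_rate": ratio(len(refused), len(records)),
        "refusal_precision": ratio(len(justified), len(refused)),
        "over_refusal_rate": ratio(len(refused) - len(justified), len(refused)),
        "smart_refusal": ratio(len(justified), len(base_wrong)),
    }


# Records reconstructed from the aggregate counts reported above (assumption).
records = (
    [Record(base_correct=True, refused=False)] * 12    # answered, correct
    + [Record(base_correct=False, refused=True)] * 14  # refused, base was wrong
    + [Record(base_correct=True, refused=True)] * 24   # refused, base knew it
)
metrics = selective_metrics(records)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Refusal precision and over-refusal rate share the same denominator (the 38 refusals), which is why they sum to 1.0 in the table.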
Training configuration (from run.json)
- Base: Qwen/Qwen3.5-4B (frozen), nf4 quantized, bfloat16
- Encoder: selective (1 cognitive token per layer, scalar tanh gate)
- Layers (read + inject): [21..31] (the late third)
- Data: MMLU, MCQ-direct (enable_thinking=False, answer-only suffix; required so the thinking model produces a letter on Pass 1, otherwise the oracle flag collapses)
- train / val / test = 250 / 50 / 50
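To make "scalar tanh gate" concrete, here is a small pure-Python sketch of a tanh-gated cross-attention residual for a single hidden-state vector. It is an illustration under assumptions, not the wrapper's actual code: the function name `gated_cross_attention` is hypothetical, the learned query/key/value projections are omitted, and the residual form `h + tanh(gate) * attended` is the common gated cross-attention formulation, which the source does not spell out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_cross_attention(hidden, keys, values, gate):
    """Attend from one hidden vector to the cognitive tokens, then add the
    result back through a scalar tanh gate (projections omitted)."""
    dim = len(hidden)
    scores = [
        sum(h * k for h, k in zip(hidden, key)) / math.sqrt(dim) for key in keys
    ]
    weights = softmax(scores)
    attended = [
        sum(w * value[i] for w, value in zip(weights, values)) for i in range(dim)
    ]
    scale = math.tanh(gate)  # 0 when gate == 0, bounded in (-1, 1)
    return [h + scale * a for h, a in zip(hidden, attended)]


h = [1.0, -2.0]
keys = [[0.5, 0.5]]    # one cognitive token per layer, as in the config above
values = [[0.2, 0.4]]

closed = gated_cross_attention(h, keys, values, gate=0.0)
opened = gated_cross_attention(h, keys, values, gate=1.0)
print(closed)
print(opened)
```

Two properties follow from this form. With the gate at zero the hidden state passes through unchanged, so the frozen base behaves exactly as it would without the wrapper. With a single cognitive token the softmax weight is always 1, so the injected vector is the token's value scaled by `tanh(gate)`, independent of the query.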
Usage
```python
from meta_core import MetaSpiderConfig, MetaSpiderPipeline, Doubter

cfg = MetaSpiderConfig(
    model_name="Qwen/Qwen3.5-4B",
    device="cuda", dtype="bfloat16", quantization="nf4",
    target_layers=list(range(21, 32)),
    cross_attn_layers=list(range(21, 32)),
)
pipe = MetaSpiderPipeline.from_pretrained(cfg)
pipe.attach(Doubter.from_checkpoint("doubter_checkpoint.pt"))

print(pipe.generate("What is the capital of France?"))
# → answers confidently
print(pipe.generate("<an obscure question the base would get wrong>"))
# → "I'm not confident enough to answer this question accurately."
```
Needs pip install meta-core transformers accelerate bitsandbytes.
Framework
This wrapper is produced and consumed by the meta-spider framework
(codeberg.org/imperius/meta-spider: meta-core / meta-loom /
meta-agent / meta-deploy). The included GGUF sidecar (doubter_sidecar.gguf, produced by
metadeploy export) runs the same wrapper on CPU inside llama.cpp; load it as a meta-adapter with
llama-meta-generate (two-pass inference). The calibrated refusal behavior holds down to Q4_K_M.
Caveats
- This wrapper is model-specific: it is calibrated to the activation distribution of Qwen/Qwen3.5-4B. It will not transfer cleanly to a different model or even a different fine-tune of it (it would push hidden states out of distribution).
- It does not add knowledge or make the model smarter. It surfaces an existing internal uncertainty signal and turns "answer at random" into "answer when confident".