RiverRider 
posted an update 17 days ago
A single forward pass of the frozen Qwen-2.5-7B model plus a lightweight classifier reaches 0.866 ± 0.011 AUC on the full TruthfulQA-MC2 benchmark. No adapters. No fine-tuning. No extra parameters on the backbone.
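
For readers who want a feel for what "frozen features plus a lightweight classifier, scored by AUC" can look like, here is a minimal sketch. It is not the code from the repository. The post does not name the classifier, the layer, or the evaluation protocol, so logistic regression, 5-fold cross-validation, and every function name below are assumptions, and random synthetic vectors stand in for the real hidden states.

```python
import math
import random
import statistics

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_probe(X, y, lr=0.1, epochs=200):
    """Full-batch gradient descent on a linear probe; the features X are treated as fixed."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def synthetic_features(n, dim, rng):
    """Hypothetical stand-in for hidden states: two dimensions carry a label-dependent shift."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(0.8 * label if j < 2 else 0.0, 1.0) for j in range(dim)])
        y.append(label)
    return X, y

def cross_validated_auc(X, y, k=5):
    """Mean and sample standard deviation of held-out AUC over k interleaved folds."""
    aucs = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        train = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i not in test_idx]
        test = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i in test_idx]
        w, b = fit_logistic_probe([x for x, _ in train], [t for _, t in train])
        scores = [sum(wj * xj for wj, xj in zip(w, x)) + b for x, _ in test]
        aucs.append(roc_auc([t for _, t in test], scores))
    return statistics.mean(aucs), statistics.stdev(aucs)

rng = random.Random(0)
X, y = synthetic_features(400, 8, rng)
mean_auc, std_auc = cross_validated_auc(X, y)
print(f"AUC = {mean_auc:.3f} ± {std_auc:.3f}")
```

In a real run the rows of `X` would be hidden-state vectors taken from one forward pass of the frozen model, and only the probe's weights would be trained. The numbers this toy prints say nothing about the reported 0.866 ± 0.011.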

To my knowledge, this is the strongest hidden-state truthfulness detector reported on the benchmark to date.

The same latent features that the SRT-NLA-AV-v1 demo reads out as coherent natural-language verbalizations turn out to be rich enough to support production-grade auditing for honesty versus hallucination. The internal semiotic infrastructure we have been exploring in public is already information-dense enough to solve hard downstream problems with almost trivial overhead.

You can watch the underlying latent geometry in action right here:
RiverRider/srt-nla-av-v1-demo

Full code, artifacts, and reproduction steps are in the repository:
https://github.com/space-bacon/SRT

Try the Glass Box
RiverRider/srt-nla-demo