Independent eval architect
Inference drift • boundary intelligence • model safety
Scope
termination and stopping behavior
instruction boundary discipline
inference drift and collapse surfaces
VL-JEPA failure modes (video reasoning)
alignment via behavior, not narrative
Artifacts
30+ public micro-benchmarks (20-row probes)
lightweight probes for internal evaluation
focus on failure detection over accuracy
Not trying to sell models
Not training, not fine-tuning, not a lab.
Evaluating how systems fall apart, before deployment.
Open for collaboration
Research groups • safety teams • eval units • robotics • multimodal inference
Contact
email / Hugging Face messages
team@loopwell.ai