CEM888.AI MemoryAgentBench Results

99.9% AR · 77.2% BEAM — Filesystem-native memory agent on MemoryAgentBench (ICLR 2026).

Scores

Benchmark	CEM888 (Vetta)	Best Published
AR Retrieval	99.9%	71.5% (Hindsight)
BEAM Memory	77.2%	64.1% (Hindsight honest)

AR: 2,000 retrieval questions, 2 misses (1,998 correct)
BEAM: 200 multi-category memory questions
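
As a quick sanity check on the figures above, the arithmetic can be reproduced in a few lines. This is only a sketch: the helper name `accuracy_pct` is made up for illustration, and the BEAM line assumes the headline number is a plain mean over the 200 questions, which the card does not state.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage of `total`."""
    return 100.0 * correct / total


# AR: 2 misses out of 2,000 questions (figures taken from the card).
ar_total = 2000
ar_misses = 2
ar_pct = accuracy_pct(ar_total - ar_misses, ar_total)
print(f"AR accuracy: {ar_pct:.1f}%")

# BEAM: 77.2% over 200 questions.
# ASSUMPTION: the headline is an unweighted mean over all 200 questions.
beam_points = 0.772 * 200
print(f"BEAM points under that assumption: {beam_points:.1f} of 200")
```

Under that assumption the BEAM total comes to 154.4 points, which is not a whole number, so the per-question scores are presumably graded rather than strictly right or wrong, or the headline is an average of per-category scores. The methodology file listed under Contents is the place to confirm which.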

Architecture

Model: DeepSeek V4 Pro
Retrieval: Filesystem-first, deterministic search — no RAG, no embeddings, no vector DB
Memory: Agent-native sovereign vault — the filesystem is ground truth
Deployment: Fully local. No cloud. No data leakage.
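
To make "filesystem-first, deterministic search" concrete, here is a minimal sketch of what such a lookup could look like. It is not the CEM888 implementation, which this card does not describe. The function name `search_vault`, the plain substring match, and the assumption that the vault is a directory of UTF-8 text files are all illustrative.

```python
from pathlib import Path


def search_vault(root: str, query: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for each line containing `query`.

    Hypothetical helper: files are visited in sorted order and matched by
    case-insensitive substring, so the same vault and query always yield
    the same result list. No embeddings or index are involved.
    """
    needle = query.casefold()
    root_path = Path(root)
    hits: list[tuple[str, int, str]] = []
    for path in sorted(p for p in root_path.rglob("*") if p.is_file()):
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                rel = path.relative_to(root_path).as_posix()
                hits.append((rel, lineno, line.strip()))
    return hits
```

Calling `search_vault("vault", "dentist")` on a hypothetical `vault/` directory would return every matching line together with its file and line number, in a stable order.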

Contents

AR-Results-99.9pct.md — Full AR breakdown with all categories
Vetta-BEAM-Honest-77.2pct.md — BEAM methodology and per-category scores
vetta_beam_v9_results.jsonl — All 200 BEAM questions with scores
vetta_live_results.jsonl — All 2,000 AR questions with scores
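
The two `.jsonl` files should allow the headline scores to be recomputed. The sketch below averages one numeric field over a JSON Lines file. The field name `score` is an assumption, since the card does not document the record schema, so check the first line of each file before relying on it.

```python
import json
from pathlib import Path


def mean_score(path: str, field: str = "score") -> float:
    """Average a numeric field over all non-empty lines of a JSONL file.

    ASSUMPTION: each line is a JSON object carrying a numeric `field`;
    the default name "score" is a guess, not a documented schema.
    """
    scores: list[float] = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            scores.append(float(record[field]))
    if not scores:
        raise ValueError(f"no records found in {path}")
    return sum(scores) / len(scores)
```

For example, `100 * mean_score("vetta_live_results.jsonl")` would give the AR percentage if each record stores 1 for a hit and 0 for a miss under `score`.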

Links

GitHub: CEM888.AI-Site
Reddit: r/LocalLLaMA discussion
Contact: creator@cem888.ai

Building this solo. Looking for sponsors and collaborators.

