AI Internet Diagnostic
Tells you the specific reason your Wi-Fi just dropped: evidence-grounded, confidence-scored attribution like "your school's 802.1X session expired at 09:14:23; here are the three telemetry signals that prove it."
Results
Macro F1: 0.974 (synthetic) · pending (real, Reality Anchor dogfood) · ECE 0.28
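For readers unfamiliar with the two metrics quoted above, here is a minimal standard-library sketch of how macro F1 and expected calibration error (ECE) are commonly computed. It is not the project's evaluation code. The function names, the equal-width binning, and the choice of 10 bins are assumptions made for illustration.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins = defaultdict(list)
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins.values():
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece
```

Here `confidences` would be the top-class probability per sample and `correct` a 0/1 flag for whether that top class matched the label. Lower ECE means the reported confidence tracks observed accuracy more closely.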
Architecture
Mermaid source (renders on GitHub)
flowchart LR
L["Laptop telemetry"] --> S["wifi-diag-schema"]
S --> CLS["LightGBM 10-class classifier"]
S --> ANO["PyOD IForest anomaly detector"]
CLS --> V["Verdict + EvidenceItems"]
ANO --> V
V --> N["Anthropic Haiku 4.5 narrator"]
N --> UI["Gradio Live tab + Agent CLI"]
style CLS fill:#3498db,stroke:#1b4f72,stroke-width:3px,color:#fff
style ANO fill:#3498db,stroke:#1b4f72,stroke-width:3px,color:#fff
style V fill:#2ecc71,stroke:#196f3d,stroke-width:2px,color:#fff
Trained models (blue) sit at the visual gravity center of the pipeline. The LLM narrator (green) is downstream: it explains what the classifier and anomaly detector found, with citations to specific telemetry fields. This is not a GPT wrapper.
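To illustrate the ordering described above, here is a rough sketch of how a verdict might bundle model outputs with evidence before any narration happens. Everything in it is hypothetical: the dataclass shapes, the function names, the class labels, and the `eap_session_age_s` telemetry field are invented for illustration and are not taken from wifi-diag-schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    """One telemetry observation backing a verdict (hypothetical shape)."""
    telemetry_field: str
    value: float
    note: str


@dataclass(frozen=True)
class Verdict:
    """Classifier and anomaly-detector output, bundled before narration."""
    label: str
    confidence: float
    anomaly_score: float
    evidence: tuple


def build_verdict(class_probs, anomaly_score, evidence):
    """Pick the most probable class and attach the supporting evidence."""
    label = max(class_probs, key=class_probs.get)
    return Verdict(label, class_probs[label], anomaly_score, tuple(evidence))


def narrator_prompt(verdict):
    """Render a prompt that restricts the narrator to the listed evidence."""
    lines = [
        f"Verdict: {verdict.label} (confidence {verdict.confidence:.2f}, "
        f"anomaly score {verdict.anomaly_score:.2f})",
        "Explain this verdict citing only the telemetry fields below:",
    ]
    lines += [f"- {e.telemetry_field} = {e.value}: {e.note}" for e in verdict.evidence]
    return "\n".join(lines)


# Hypothetical class labels and telemetry field, for illustration only.
verdict = build_verdict(
    {"dot1x_session_expired": 0.91, "dhcp_lease_lost": 0.06, "ap_roam_failure": 0.03},
    anomaly_score=0.72,
    evidence=[EvidenceItem("eap_session_age_s", 28800.0, "session age at drop")],
)
prompt = narrator_prompt(verdict)
```

In this sketch the label and confidence are fixed before the prompt is built, so the language model only phrases a decision the trained models already made.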
Try it live
Live demo on Hugging Face Spaces
AI Internet Diagnostic: Model Repo
LightGBM 10-class disconnect classifier + PyOD anomaly detector + reproducible synthetic-data generator for the AI Internet Diagnostic project.
This is one of four repos in the project topology (D-10 / D-11):
- ai-internet-diagnostic-space: Hugging Face Space (Phase 3)
- ai-internet-diagnostic-model (this repo): model artefacts + synthetic-data generator (Phase 1–2)
- ai-internet-diagnostic-agent: cross-platform local telemetry agent (Phase 4)
- wifi-diag-schema: Pydantic wire-format schema, published to PyPI (Phase 1)
Quickstart
uv sync --all-extras --dev
make synth # regenerate data/train.parquet (100k) + data/eval.parquet (20k); <30s
make test # run unit tests
Reproducibility
make synth regenerates train + eval Parquet byte-identically from fixed master seeds (D-08):
- MASTER_TRAIN_SEED = 20260501 → data/train.parquet (10,000 samples × 10 classes)
- MASTER_EVAL_SEED = 20260502 → data/eval.parquet (2,000 samples × 10 classes)
Per-class PCG64 sub-streams via SeedSequence.spawn() (RESEARCH Pattern 4) guarantee determinism.
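The sub-stream pattern can be sketched with NumPy's `SeedSequence`, `PCG64`, and `Generator`. This is a minimal illustration, not the repo's generator: the helper names and the `standard_normal` draw standing in for real feature sampling are assumptions; only the seed values and the class count come from the text above.

```python
import numpy as np

MASTER_TRAIN_SEED = 20260501  # value quoted in this README
MASTER_EVAL_SEED = 20260502   # value quoted in this README
N_CLASSES = 10


def class_generators(master_seed, n_classes=N_CLASSES):
    """Return one independent PCG64-backed Generator per class."""
    children = np.random.SeedSequence(master_seed).spawn(n_classes)
    return [np.random.Generator(np.random.PCG64(child)) for child in children]


def draw(master_seed, samples_per_class=5):
    """Draw a (n_classes, samples_per_class) array.

    standard_normal is a placeholder for whatever per-class sampling
    the real generator performs.
    """
    gens = class_generators(master_seed)
    return np.stack([g.standard_normal(samples_per_class) for g in gens])


# Two runs from the same master seed produce identical arrays.
first = draw(MASTER_TRAIN_SEED)
second = draw(MASTER_TRAIN_SEED)
```

Because each class draws from its own child stream, changing how many samples one class consumes does not shift the values generated for any other class.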
Datasheet
See DATASHEET.md for the Gebru-format dataset card. Per CONTEXT.md D-09, the Limitations section leads with the synthetic-vs-real gap; the Reality Anchor placeholder is reserved for Phase 4 dogfood data.
License
Apache-2.0

