rampart-mlx

A 4-bit MLX port of nationaldesignstudio/rampart — a tiny (~18.5M param) MiniLM-L6-H384 BERT that detects PII via a 35-label BIO head (17 entity types). This version runs natively on Apple Silicon through MLX and mlx-swift for on-device inference.

What this is

The upstream model ships only as 4-bit ONNX (MatMulNBits + INT8 embeddings); there is no PyTorch/safetensors checkpoint. This repo was produced by recovering float weights from the ONNX graph, verifying that an MLX reimplementation reproduces ONNX Runtime's output (100% token-label match, max logit difference 1e-5), then re-quantizing natively with MLX.

Variant                            Size      Token-label agreement vs ONNX-q4
MLX fp32 (reconstructed)           28 MB     100.0%
MLX 4-bit / group 64 (this repo)   11.6 MB   94.5% entity-type & PII-vs-O

The small gap comes from quantizing a second time on top of the already-q4 source: true fp32 weights were never available upstream, so the MLX quantizer starts from weights that already carry 4-bit rounding error.
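To make the double-quantization effect concrete, here is a minimal plain-Python sketch of group-wise 4-bit affine quantization (each group stored as integer codes plus a scale and a bias, as in the weight/scales/biases layout). It is an illustration, not MLX's or ONNX Runtime's actual kernel: the helper names are hypothetical, the weights are synthetic, and the group size of 32 used for the "source" pass is an assumption, since the upstream block size is not stated here.

```python
import math

def quantize_group(values, bits=4):
    """Map one group of floats to integer codes plus a (scale, bias) pair."""
    levels = (1 << bits) - 1              # 15 code steps for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0     # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def roundtrip(weights, group_size, bits=4):
    """Quantize then dequantize, one group at a time: w ~ scale * q + bias."""
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, bias = quantize_group(group, bits)
        out.extend(scale * q + bias for q in codes)
    return out

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Synthetic weights whose range changes every 32 entries.
weights = [math.sin(0.37 * i) * (1 + i // 32) for i in range(128)]

# Stand-in for the already-quantized source (group size 32 is an assumption).
source_q4 = roundtrip(weights, group_size=32)

# Re-quantizing on the same grid is essentially lossless ...
same_grid_drift = max_abs_diff(source_q4, roundtrip(source_q4, group_size=32))
# ... but a different grouping (64) puts the values on a new grid and adds error.
regrouped_drift = max_abs_diff(source_q4, roundtrip(source_q4, group_size=64))

print(same_grid_drift, regrouped_drift)
```

The second number is the analogue of the agreement gap in the table: the re-quantized weights can no longer sit exactly on the source's 16-level grid once the grouping or the scale/bias convention changes.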

Files

  • model.safetensors — quantized weights (MLX format: weight/scales/biases, 4-bit, group size 64)
  • config.json — BERT config + quantization block (group_size: 64, bits: 4)
  • vocab.txt, tokenizer.json, tokenizer_config.json, special_tokens_map.json
  • rampart_mlx.py — BertForTokenClassification in MLX (the model module)
  • pii_rules.py — deterministic regex/checksum layer (see below)
  • demo.py — runnable end-to-end demo (neural model + deterministic layer)

Quick demo

pip install mlx transformers huggingface_hub
huggingface-cli download sledgedev/rampart-mlx --local-dir rampart-mlx
cd rampart-mlx

python demo.py                                   # interactive prompt
python demo.py "my email is a@b.com and ssn 078-05-1120"
echo "card 4111 1111 1111 1111" | python demo.py

Example output (second command):

my email is a@b.com and ssn 078-05-1120
  EMAIL          a@b.com        ·rule
  SSN            078-05-1120    ·rule
  → my email is [EMAIL] and ssn [SSN]

The deterministic layer

The shipped weights are the neural half of Rampart only. Per the upstream whitepaper, the full system pairs the model with a deterministic layer of regexes + checksum/structural validators that is the system of record for the classes the model alone is weak on. pii_rules.py reimplements that layer:

  • CREDIT_CARD — 13–19 digit runs validated by the Luhn checksum (separator-agnostic)
  • SSN — NNN-NN-NNNN with structural rules (rejects area 000/666/9xx, group/serial 00/0000)
  • EMAIL / URL / IP_ADDRESS — pattern match (structure lives in the punctuation)
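The two validated classes above can be sketched in a few lines of standard-library Python. This is a minimal illustration of the rules as described, not the contents of pii_rules.py: the function names, the regexes, and the choice of space or hyphen as the only card separators are assumptions.

```python
import re

def luhn_ok(digits: str) -> bool:
    """Luhn checksum over a string of decimal digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# 13 to 19 digits, optionally separated by single spaces or hyphens (assumed).
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def find_cards(text):
    """Return (start, end, label) for digit runs that pass the Luhn check."""
    hits = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            hits.append((m.start(), m.end(), "CREDIT_CARD"))
    return hits

SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

def find_ssns(text):
    """Return (start, end, label) for NNN-NN-NNNN strings that pass structural rules."""
    hits = []
    for m in SSN_RE.finditer(text):
        area, group, serial = m.groups()
        if area in ("000", "666") or area.startswith("9"):
            continue          # area numbers that are rejected
        if group == "00" or serial == "0000":
            continue          # all-zero group or serial is rejected
        hits.append((m.start(), m.end(), "SSN"))
    return hits

print(find_cards("card 4111 1111 1111 1111"))
print(find_ssns("ssn 078-05-1120 or 666-12-3456"))
```

Returning character offsets rather than matched strings keeps these hits directly comparable with the model's spans.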

demo.py takes the union of the model's spans and these rule spans; where the two overlap, the deterministic layer wins, so e.g. an SSN the model reads as PHONE is corrected to SSN. The demo works on character offsets, so output preserves the original casing.
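A minimal sketch of that merge, assuming spans are (start, end, label) tuples over character offsets. The function names and the hard-coded model spans are hypothetical; demo.py's actual code may be organized differently.

```python
def merge_spans(model_spans, rule_spans):
    """Union two span lists; a rule span replaces any model span it overlaps."""
    kept = [
        m for m in model_spans
        if not any(m[0] < r[1] and r[0] < m[1] for r in rule_spans)
    ]
    return sorted(kept + list(rule_spans))

def redact(text, spans):
    """Replace each span with [LABEL], right to left so offsets stay valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

text = "call (650) 555-2020, ssn 078-05-1120"
phone, ssn = "(650) 555-2020", "078-05-1120"
p, s = text.index(phone), text.index(ssn)

# Hypothetical model output: the SSN is misread as PHONE.
model_spans = [(p, p + len(phone), "PHONE"), (s, s + len(ssn), "PHONE")]
# Rule output: the structural SSN validator fires on the same characters.
rule_spans = [(s, s + len(ssn), "SSN")]

merged = merge_spans(model_spans, rule_spans)
redacted = redact(text, merged)
print(redacted)
```

The phone number survives because no rule span touches it, while the misread SSN is dropped in favor of the rule's label.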

Usage (Swift / mlx-swift)

A complete mlx-swift implementation (BERT module, WordPiece tokenizer, PII span extraction, and CLI) is available in the conversion project. Example output:

$ rampart-cli ./rampart-mlx "Email sarah.lee@acme.io or call (650) 555-2020. Passport A1234567."
  EMAIL      sarah.lee@acme.io
  PHONE      (650)555-2020
  PASSPORT   a1234567

Note: a bare SwiftPM CLI binary can't locate MLX's Metal library; copy mlx.metallib (from the pip mlx wheel) next to the binary, or run inside an Xcode app target where Metal resources are bundled automatically.

Caveats

  • The neural model alone is weak on SSNs/cards (it may read an SSN as PHONE) — this matches ONNX Runtime. The bundled pii_rules.py deterministic layer is the system of record for those classes and corrects them in demo.py.
  • demo.py uses tokenizer character offsets, so output keeps the original casing; the model itself is BERT-uncased.
  • This repo contains the on-device neural component; it is not the complete upstream redaction product. Use it accordingly.
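The character-offset point can be illustrated with a small BIO decoder. The sketch below assumes per-token labels of the form O, B-X and I-X (which is how 17 entity types give 1 + 2 × 17 = 35 labels) and per-token (start, end) offsets from the tokenizer; the token boundaries and label strings shown are hypothetical, not actual model output.

```python
def bio_to_spans(labels, offsets):
    """Collapse per-token BIO labels into (start, end, entity) character spans."""
    spans = []
    current = None                      # [start, end, entity] of the open span
    for label, (start, end) in zip(labels, offsets):
        if label == "O":
            prefix, entity = "O", None
        else:
            prefix, entity = label.split("-", 1)
        if prefix == "I" and current is not None and current[2] == entity:
            current[1] = end            # extend the open span
            continue
        if current is not None:         # anything else closes the open span
            spans.append(tuple(current))
            current = None
        if entity is not None:          # B-X, or a stray I-X treated as a start
            current = [start, end, entity]
    if current is not None:
        spans.append(tuple(current))
    return spans

text = "Passport A1234567."
# Hypothetical WordPiece split: passport | a | ##12 | ##34 | ##56 | ##7 | .
offsets = [(0, 8), (9, 10), (10, 12), (12, 14), (14, 16), (16, 17), (17, 18)]
labels = ["O", "B-PASSPORT", "I-PASSPORT", "I-PASSPORT",
          "I-PASSPORT", "I-PASSPORT", "O"]

spans = bio_to_spans(labels, offsets)
values = [text[start:end] for start, end, _ in spans]
print(spans, values)
```

Because the span is cut from the original string rather than rebuilt from lowercased tokens, the extracted value keeps its casing even though the model only ever sees uncased input.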

Attribution

This is a derivative of nationaldesignstudio/rampart and is released under the same CC-BY-4.0 license.
