rampart-mlx

A 4-bit MLX port of nationaldesignstudio/rampart — a tiny (~18.5M param) MiniLM-L6-H384 BERT that detects PII via a 35-label BIO head (17 entity types). This version runs natively on Apple Silicon through MLX and mlx-swift for on-device inference.
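
To make the label count concrete: in a BIO scheme each of the 17 entity types gets a B- (begin) and an I- (inside) label, plus one shared O label, which gives 2 × 17 + 1 = 35. The sketch below shows one way per-token BIO labels and tokenizer character offsets could be folded into entity spans. The function name, the label strings, and the offsets are illustrative assumptions and are not taken from rampart_mlx.py or demo.py.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (char_start, char_end, entity_type)

def bio_to_spans(labels: List[str], offsets: List[Tuple[int, int]]) -> List[Span]:
    """Fold per-token BIO labels into character-level entity spans."""
    spans: List[Span] = []
    current = None  # [start, end, type] of the span being built, if any
    for label, (start, end) in zip(labels, offsets):
        # An I- label of the same type extends the open span.
        if label.startswith("I-") and current and current[2] == label[2:]:
            current[1] = end
            continue
        # Anything else closes the open span first.
        if current:
            spans.append(tuple(current))
            current = None
        # B- opens a new span; a stray I- is treated as an opening too.
        if label != "O":
            current = [start, end, label[2:]]
    if current:
        spans.append(tuple(current))
    return spans

# Hypothetical labels and offsets for five tokens.
labels = ["O", "B-EMAIL", "I-EMAIL", "O", "B-PHONE"]
offsets = [(0, 5), (6, 10), (10, 17), (18, 20), (21, 29)]
print(bio_to_spans(labels, offsets))
```

Working on character offsets rather than token strings is what lets a caller slice the original text, so casing and spacing survive even though the model is uncased.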

What this is

The upstream model ships only as 4-bit ONNX (MatMulNBits + INT8 embeddings); there is no PyTorch/safetensors checkpoint. This repo was produced in three steps: recovering float weights from the ONNX graph, verifying that an MLX reimplementation reproduces ONNX Runtime output (100% token-label match, maximum logit difference 1e-5), and re-quantizing natively with MLX.

Variant	Size	Token-label agreement vs ONNX-q4
MLX fp32 (reconstructed)	28 MB	100.0%
MLX 4-bit / group 64 (this repo)	11.6 MB	94.5% entity-type & PII-vs-O

The gap comes from quantizing a second time on top of the already-q4 source: the reconstructed fp32 weights are dequantized 4-bit values, because true fp32 weights were never available upstream.
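
Agreement figures like those in the table above can be computed along the following lines. This is a minimal sketch, assuming per-token logits from each backend are already available as nested lists; the function name and the toy logits are hypothetical, and the evaluation script actually used for this repo is not shown here.

```python
from typing import List, Tuple

Logits = List[List[float]]  # one row of class scores per token

def compare_logits(reference: Logits, candidate: Logits) -> Tuple[float, float]:
    """Return (token-label agreement in [0, 1], max absolute logit difference)."""
    assert reference, "need at least one token"
    assert len(reference) == len(candidate), "token counts must match"
    matches = 0
    max_diff = 0.0
    for ref_row, cand_row in zip(reference, candidate):
        # The predicted label is the argmax over the class scores.
        ref_label = max(range(len(ref_row)), key=ref_row.__getitem__)
        cand_label = max(range(len(cand_row)), key=cand_row.__getitem__)
        matches += ref_label == cand_label
        row_diff = max(abs(r - c) for r, c in zip(ref_row, cand_row))
        max_diff = max(max_diff, row_diff)
    return matches / len(reference), max_diff

# Toy logits for two tokens and three classes (made up for illustration).
reference = [[2.0, 0.1, -1.0], [0.0, 3.0, 0.5]]
candidate = [[1.9, 0.2, -1.0], [0.0, 0.4, 0.5]]
agreement, max_diff = compare_logits(reference, candidate)
print(f"agreement={agreement:.1%} max_diff={max_diff:.3g}")
```

Label agreement and logit difference answer different questions: a port can match every label while its logits drift slightly, which is why both numbers are worth reporting.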

Files

model.safetensors — quantized weights (MLX format: weight/scales/biases, 4-bit, group size 64)
config.json — BERT config + quantization block (group_size: 64, bits: 4)
vocab.txt, tokenizer.json, tokenizer_config.json, special_tokens_map.json
rampart_mlx.py — BertForTokenClassification in MLX (the model module)
pii_rules.py — deterministic regex/checksum layer (see below)
demo.py — runnable end-to-end demo (neural model + deterministic layer)
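
For readers unfamiliar with the weight/scales/biases layout listed above: in group-wise affine quantization, each run of 64 consecutive weights shares one scale and one bias, and every weight is stored as a 4-bit integer q such that w ≈ scale × q + bias. The pure-Python sketch below illustrates the idea on a single row. It is a simplified min/max scheme written for illustration, not MLX's actual kernel; it skips the packing of 4-bit codes into larger integer words, and the function names are hypothetical.

```python
from typing import List, Tuple

def quantize_row(row: List[float], group_size: int = 64, bits: int = 4
                 ) -> Tuple[List[int], List[float], List[float]]:
    """Quantize one weight row group by group so that w ~= scale * q + bias."""
    assert len(row) % group_size == 0, "row length must be a multiple of group_size"
    levels = (1 << bits) - 1  # highest code: 15 for 4-bit
    codes: List[int] = []
    scales: List[float] = []
    biases: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels
        if scale == 0.0:
            scale = 1.0  # constant group: every code is 0, bias carries the value
        scales.append(scale)
        biases.append(lo)
        for w in group:
            q = round((w - lo) / scale)
            codes.append(min(levels, max(0, q)))  # clamp to the 4-bit range
    return codes, scales, biases

def dequantize_row(codes: List[int], scales: List[float], biases: List[float],
                   group_size: int = 64) -> List[float]:
    """Map 4-bit codes back to floats using each group's scale and bias."""
    return [scales[i // group_size] * q + biases[i // group_size]
            for i, q in enumerate(codes)]

# Deterministic toy weights: two groups of 64.
row = [((i * 37) % 101 - 50) / 1000 for i in range(128)]
codes, scales, biases = quantize_row(row)
restored = dequantize_row(codes, scales, biases)
print(max(abs(a - b) for a, b in zip(row, restored)))
```

Within a group the rounding error of this scheme is at most half a scale step, so a wide-ranging group costs precision for all 64 of its weights.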

Quick demo

pip install mlx transformers huggingface_hub
huggingface-cli download sledgedev/rampart-mlx --local-dir rampart-mlx
cd rampart-mlx

python demo.py                                   # interactive prompt
python demo.py "my email is a@b.com and ssn 078-05-1120"
echo "card 4111 1111 1111 1111" | python demo.py

Example output (second command):

my email is a@b.com and ssn 078-05-1120
  EMAIL          a@b.com        ·rule
  SSN            078-05-1120    ·rule
  → my email is [EMAIL] and ssn [SSN]

The deterministic layer

The shipped weights are only the neural half of Rampart. Per the upstream whitepaper, the full system pairs the model with a deterministic layer of regexes and checksum/structural validators, which is the system of record for the classes the model alone handles poorly. pii_rules.py reimplements that layer:

CREDIT_CARD — 13–19 digit runs validated by the Luhn checksum (separator-agnostic)
SSN — NNN-NN-NNNN with structural rules (rejects area 000/666/9xx, group/serial 00/0000)
EMAIL / URL / IP_ADDRESS — pattern match (structure lives in the punctuation)
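
The two validators described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of pii_rules.py: the regexes, the function names, and the choice to accept only spaces and hyphens as card separators are all hypothetical, and the EMAIL / URL / IP_ADDRESS patterns are left out.

```python
import re
from typing import List, Tuple

# Assumed patterns: 13-19 digits with optional single space/hyphen separators,
# and the NNN-NN-NNNN SSN layout.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b", re.ASCII)
SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b", re.ASCII)

def luhn_valid(candidate: str) -> bool:
    """Luhn checksum over the digits of candidate, ignoring separators."""
    digits = [int(ch) for ch in candidate if "0" <= ch <= "9"]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def ssn_valid(candidate: str) -> bool:
    """Structural SSN check: reject area 000/666/9xx, group 00, serial 0000."""
    m = SSN_RE.fullmatch(candidate)
    if not m:
        return False
    area, group, serial = m.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"

def find_rule_spans(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) character spans that pass validation."""
    spans = []
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            spans.append((m.start(), m.end(), "CREDIT_CARD"))
    for m in SSN_RE.finditer(text):
        if ssn_valid(m.group()):
            spans.append((m.start(), m.end(), "SSN"))
    return sorted(spans)

print(find_rule_spans("card 4111 1111 1111 1111"))  # [(5, 24, 'CREDIT_CARD')]
```

The checksum and structural rules are what keep a pattern-only match from flagging arbitrary digit runs: a 16-digit number that fails Luhn, or an SSN-shaped string with area 666, is simply dropped.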

demo.py takes the union of the model's spans and these rule spans; where they overlap, the deterministic layer wins, so an SSN that the model reads as PHONE is corrected to SSN. The demo works on character offsets, so output preserves the original casing.
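
The merge policy described above can be sketched like this. The span tuple layout and the function names are assumptions made for illustration; demo.py may structure this differently. The sketch also assumes that rule spans do not overlap one another.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (char_start, char_end, label)

def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open character ranges share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def merge_spans(model_spans: List[Span], rule_spans: List[Span]) -> List[Span]:
    """Union of both span sets; any model span touching a rule span is dropped."""
    kept = [m for m in model_spans
            if not any(overlaps(m, r) for r in rule_spans)]
    return sorted(kept + list(rule_spans))

def redact(text: str, spans: List[Span]) -> str:
    """Replace each span with [LABEL], slicing the original text by offset."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)

text = "my email is a@b.com and ssn 078-05-1120"
model_spans = [(12, 19, "EMAIL"), (28, 39, "PHONE")]  # model misreads the SSN
rule_spans = [(12, 19, "EMAIL"), (28, 39, "SSN")]
merged = merge_spans(model_spans, rule_spans)
print(redact(text, merged))  # my email is [EMAIL] and ssn [SSN]
```

Because the replacement slices the original string by offset, everything outside the spans is copied through untouched, including casing.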

Usage (Swift / mlx-swift)

A complete mlx-swift implementation (BERT module, WordPiece tokenizer, PII span extraction, and CLI) is available in the conversion project. Example output:

$ rampart-cli ./rampart-mlx "Email sarah.lee@acme.io or call (650) 555-2020. Passport A1234567."
  EMAIL      sarah.lee@acme.io
  PHONE      (650)555-2020
  PASSPORT   a1234567

Note: a bare SwiftPM CLI binary can't locate MLX's Metal library; copy mlx.metallib (from the pip mlx wheel) next to the binary, or run inside an Xcode app target where Metal resources are bundled automatically.

Caveats

The neural model alone is weak on SSNs/cards (it may read an SSN as PHONE) — this matches ONNX Runtime. The bundled pii_rules.py deterministic layer is the system of record for those classes and corrects them in demo.py.
demo.py uses tokenizer character offsets, so output keeps the original casing; the model itself is BERT-uncased.
This repo contains the on-device neural component; it is not the complete upstream redaction product. Use it accordingly.

Attribution

Original model: nationaldesignstudio/rampart (CC-BY-4.0)
Base: nreimers/MiniLM-L6-H384-uncased
Training data: ai4privacy/pii-masking-openpii-1.5m

This derivative is released under the same CC-BY-4.0 license.

