rampart-mlx
A 4-bit MLX port of nationaldesignstudio/rampart
— a tiny (~18.5M param) MiniLM-L6-H384 BERT that detects PII via a 35-label BIO
head (17 entity types). This version runs natively on Apple Silicon through
MLX and
mlx-swift for on-device inference.
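The label count is consistent with a standard BIO scheme: one O label plus a B- and an I- tag for each of the 17 entity types (1 + 2 × 17 = 35). As a rough illustration of how per-token BIO tags become entity spans, here is a minimal sketch. The function name `bio_to_spans` and the tag strings are illustrative assumptions, not code or label names taken from this repo.

```python
def bio_to_spans(tags):
    """Group per-token BIO tags into (entity_type, start_token, end_token) spans.

    end_token is exclusive. Tag strings such as "B-EMAIL" are illustrative
    placeholders; the model's actual label names are not listed here.
    """
    spans = []
    current = None  # [entity_type, start, end] of the span being built
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "I" and current is not None and current[0] == etype:
            current[2] = i + 1            # extend the open span
            continue
        if current is not None:           # any other tag closes the open span
            spans.append(tuple(current))
            current = None
        if prefix in ("B", "I"):          # a stray I- without a matching B- starts a new span
            current = [etype, i, i + 1]
    if current is not None:
        spans.append(tuple(current))
    return spans


tags = ["O", "B-EMAIL", "I-EMAIL", "O", "B-SSN", "I-SSN", "I-SSN"]
print(bio_to_spans(tags))  # [('EMAIL', 1, 3), ('SSN', 4, 7)]
```

Treating a stray I- tag as the start of a new span is one common lenient choice; a stricter decoder could drop such tokens instead.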
What this is
The upstream model ships only as 4-bit ONNX (MatMulNBits + INT8
embeddings) — there is no PyTorch/safetensors checkpoint. This repo was produced
by recovering float weights from the ONNX graph, verifying an MLX
reimplementation reproduces ONNX Runtime exactly (100% token-label match,
max logit diff 1e-5), then re-quantizing natively with MLX.
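The two verification figures above (token-label match and maximum logit difference) can be computed from any pair of logit tables. The following is a minimal pure-Python sketch of that comparison, assuming both runtimes return logits as `[tokens][labels]` nested lists; the helper name `compare_logits` and the toy numbers are made up for illustration and are not the script used to produce this repo.

```python
def compare_logits(ref_logits, new_logits):
    """Compare two [tokens][labels] logit tables from different runtimes.

    Returns (label_agreement, max_abs_diff): the fraction of tokens whose
    argmax label matches, and the largest absolute logit difference.
    Assumes at least one token.
    """
    assert len(ref_logits) == len(new_logits), "token counts differ"
    matches = 0
    max_abs_diff = 0.0
    for ref_row, new_row in zip(ref_logits, new_logits):
        ref_label = max(range(len(ref_row)), key=ref_row.__getitem__)
        new_label = max(range(len(new_row)), key=new_row.__getitem__)
        matches += ref_label == new_label
        for r, n in zip(ref_row, new_row):
            max_abs_diff = max(max_abs_diff, abs(r - n))
    return matches / len(ref_logits), max_abs_diff


# Toy logits for 3 tokens x 3 labels (made-up numbers, not model output).
reference = [[2.0, 0.1, -1.0], [0.3, 1.5, 0.2], [-0.5, 0.0, 3.0]]
candidate = [[2.0, 0.1, -1.0], [0.3, 1.5, 0.2], [-0.5, 0.0, 3.0]]
agreement, diff = compare_logits(reference, candidate)
print(f"label agreement {agreement:.1%}, max |logit diff| {diff:.1e}")
```

In practice the inputs would be the ONNX Runtime and MLX outputs for the same tokenized text, with padding tokens excluded before comparing.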
| Variant | Size | Token-label agreement vs ONNX-q4 |
|---|---|---|
| MLX fp32 (reconstructed) | 28 MB | 100.0% |
| MLX 4-bit / group 64 (this repo) | 11.6 MB | 94.5% entity-type & PII-vs-O |
The small gap is from a second quantization on top of the already-q4 source (true fp32 weights were never available upstream).
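To make the source of that gap concrete, here is a simplified sketch of group-wise affine quantization: each group of 64 weights is snapped to at most 16 levels, described by a per-group scale and bias. This is an illustration of the general technique in pure Python, not MLX's actual kernels or bit packing, and the function names are hypothetical. Re-quantizing weights that were already quantized on a different grid snaps them a second time, which is where small label disagreements can come from.

```python
import math


def quantize_group(values, bits=4):
    """Affine-quantize one group of floats to unsigned `bits`-bit integer codes."""
    levels = (1 << bits) - 1            # 15 for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo             # lo plays the role of the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from codes: w ~ code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row, group_size=64, bits=4):
    """Split a row into groups and quantize each group independently."""
    assert len(row) % group_size == 0, "row length must be a multiple of group_size"
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


# Toy "weights" (not from the model): 128 values, so two groups of 64.
row = [math.sin(i) for i in range(128)]
restored = []
for codes, scale, bias in quantize_row(row):
    restored.extend(dequantize_group(codes, scale, bias))
worst = max(abs(a - b) for a, b in zip(row, restored))
print(f"max round-trip error: {worst:.4f}")
```

The round-trip error of each weight is bounded by half the group's scale, so groups with a wide value range lose the most precision.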
Files
- model.safetensors: quantized weights (MLX format: weight/scales/biases, 4-bit, group size 64)
- config.json: BERT config + quantization block (group_size: 64, bits: 4)
- vocab.txt, tokenizer.json, tokenizer_config.json, special_tokens_map.json
- rampart_mlx.py: BertForTokenClassification in MLX (the model module)
- pii_rules.py: deterministic regex/checksum layer (see below)
- demo.py: runnable end-to-end demo (neural model + deterministic layer)
Quick demo
pip install mlx transformers huggingface_hub
huggingface-cli download sledgedev/rampart-mlx --local-dir rampart-mlx
cd rampart-mlx
python demo.py # interactive prompt
python demo.py "my email is a@b.com and ssn 078-05-1120"
echo "card 4111 1111 1111 1111" | python demo.py
my email is a@b.com and ssn 078-05-1120
EMAIL a@b.com ·rule
SSN 078-05-1120 ·rule
→ my email is [EMAIL] and ssn [SSN]
The deterministic layer
The shipped weights are the neural half of Rampart only. Per the upstream
whitepaper, the full system pairs the model with a deterministic layer of
regexes + checksum/structural validators that is the system of record for the
classes the model alone is weak on. pii_rules.py reimplements that layer:
- CREDIT_CARD: 13–19 digit runs validated by the Luhn checksum (separator-agnostic)
- SSN: NNN-NN-NNNN with structural rules (rejects area 000/666/9xx, group 00, serial 0000)
- EMAIL / URL / IP_ADDRESS: pattern match (structure lives in the punctuation)
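The two validators described in the list above can be sketched as follows. This is a minimal illustration of the Luhn checksum and the SSN structural rules as stated, not the contents of pii_rules.py; the function names `luhn_valid` and `ssn_valid` are assumptions made for this example.

```python
import re


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits of `number`, ignoring any non-digit separators."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:        # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


SSN_RE = re.compile(r"(\d{3})-(\d{2})-(\d{4})")


def ssn_valid(text: str) -> bool:
    """Check the NNN-NN-NNNN shape plus the area/group/serial rules."""
    m = SSN_RE.fullmatch(text)
    if not m:
        return False
    area, group, serial = m.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
print(ssn_valid("078-05-1120"))           # True
print(ssn_valid("666-12-3456"))           # False
```

Because a checksum or structural rule either passes or fails, these checks give a deterministic answer for the classes where the neural model alone is weak.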
demo.py unions the model's spans with these, the deterministic layer winning on
overlap — so e.g. an SSN the model reads as PHONE is corrected to SSN. The
demo works on character offsets, so output preserves the original casing.
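The union-with-priority step and the offset-based redaction can be sketched like this. It is a minimal illustration under stated assumptions, not the code in demo.py: spans are taken to be `(start, end, label)` tuples over character offsets with an exclusive end, rule spans are assumed not to overlap each other, and the names `merge_spans` and `redact`, as well as the NAME label, are hypothetical.

```python
def merge_spans(model_spans, rule_spans):
    """Union two span lists; a rule span replaces any model span it overlaps.

    Spans are (start, end, label) tuples over character offsets, end exclusive.
    Rule spans are assumed not to overlap each other.
    """
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    kept = [m for m in model_spans
            if not any(overlaps(m, r) for r in rule_spans)]
    return sorted(kept + list(rule_spans))


def redact(text, spans):
    """Replace each span with [LABEL], working right to left so offsets stay valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


text = "Call Sam at 078-05-1120"
model_spans = [(5, 8, "NAME"), (12, 23, "PHONE")]   # hypothetical model output
rule_spans = [(12, 23, "SSN")]                      # hypothetical rule-layer output
merged = merge_spans(model_spans, rule_spans)
print(redact(text, merged))  # Call [NAME] at [SSN]
```

Slicing the original string by character offsets, rather than rebuilding text from lowercased tokens, is what keeps the untouched parts of the input in their original casing.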
Usage (Swift / mlx-swift)
A complete mlx-swift implementation (BERT module, WordPiece tokenizer, PII span extraction, and CLI) is available in the conversion project. Example output:
$ rampart-cli ./rampart-mlx "Email sarah.lee@acme.io or call (650) 555-2020. Passport A1234567."
EMAIL sarah.lee@acme.io
PHONE (650)555-2020
PASSPORT a1234567
Note: a bare SwiftPM CLI binary can't locate MLX's Metal library; copy
mlx.metallib (from the pip mlx wheel) next to the binary, or run inside an Xcode app target where Metal resources are bundled automatically.
Caveats
- The neural model alone is weak on SSNs/cards (it may read an SSN as PHONE); this matches ONNX Runtime. The bundled pii_rules.py deterministic layer is the system of record for those classes and corrects them in demo.py.
- demo.py uses tokenizer character offsets, so output keeps the original casing; the model itself is BERT-uncased.
- This repo contains the on-device neural component; it is not the complete upstream redaction product. Use it accordingly.
Attribution
- Original model: nationaldesignstudio/rampart (CC-BY-4.0)
- Base: nreimers/MiniLM-L6-H384-uncased
- Training data: ai4privacy/pii-masking-openpii-1.5m
This derivative is released under the same CC-BY-4.0 license.