OpenMed Privacy Filter (Multilingual) — MLX 8-bit

A native MLX port of OpenMed/privacy-filter-multilingual for fast, on-device fine-grained PII detection across 54 categories and 16 languages on Apple Silicon. This 8-bit affine-quantized artifact reduces download size and resident memory; for the full-precision sibling see OpenMed/privacy-filter-multilingual-mlx.

Family at a glance. Same architecture and training data, three runtimes:

PyTorch — OpenMed/privacy-filter-multilingual — CPU + CUDA.

MLX BF16 — OpenMed/privacy-filter-multilingual-mlx — Apple Silicon, full precision (~2.6 GB).

MLX 8-bit (this repo) — OpenMed/privacy-filter-multilingual-mlx-8bit — Apple Silicon, ~1.4 GB.
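As a rough illustration of what "8-bit affine-quantized" means, the sketch below quantizes one small group of weights to integer codes plus a shared scale and bias, then reconstructs them. It assumes the common scheme w_hat = scale * q + bias; the group contents, rounding rule, and function names are illustrative and are not taken from the MLX kernels or from this repo.

```python
# Illustrative group-wise 8-bit affine quantization (not the MLX kernel).
def quantize_group(values, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] with a shared scale and bias."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w_hat = scale * q + bias."""
    return [scale * q + bias for q in codes]


weights = [-0.42, -0.10, 0.0, 0.07, 0.31, 0.55]  # made-up values
codes, scale, bias = quantize_group(weights)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as one byte, and the reconstruction error per weight is bounded by about half a quantization step (scale / 2).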

What it does

The model is a token classifier built on the OpenAI Privacy Filter architecture (openai_privacy_filter). It tags each token with a BIOES label across 54 PII span classes, then a Viterbi pass over the BIOES grammar yields clean entity spans. Languages covered: Arabic, Bengali, Chinese, Dutch, English, French, German, Hindi, Italian, Japanese, Korean, Portuguese, Spanish, Telugu, Turkish, Vietnamese.
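The decoding step described above can be sketched as a constrained Viterbi search. This is a minimal version that assumes labels are spelled like `B-EMAIL` and `O`, uses only per-token log-probabilities, and applies hard BIOES constraints with no learned transition scores; the decoder shipped with the model may differ. The scores in the example are invented.

```python
import math


def allowed(prev, nxt):
    """BIOES grammar: B and I must continue into I or E of the same class."""
    p_tag, _, p_cls = prev.partition("-")
    n_tag, _, n_cls = nxt.partition("-")
    if p_tag in ("B", "I"):
        return n_tag in ("I", "E") and n_cls == p_cls
    return n_tag in ("O", "B", "S")  # after O, E, or S a new span may start


def viterbi(log_probs, labels):
    """Return the best-scoring label path that obeys the BIOES grammar.

    log_probs is a non-empty list with one row of per-label scores per token.
    """
    neg_inf = -math.inf
    n = len(labels)
    # A sequence may only start with O, B, or S and only end with O, E, or S.
    score = [lp if lab[0] in "OBS" else neg_inf for lp, lab in zip(log_probs[0], labels)]
    back = []
    for step in log_probs[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(
                range(n),
                key=lambda i: score[i] if allowed(labels[i], labels[j]) else neg_inf,
            )
            ok = allowed(labels[best_i], labels[j])
            new_score.append(score[best_i] + step[j] if ok else neg_inf)
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    last = max(range(n), key=lambda i: score[i] if labels[i][0] in "OES" else neg_inf)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return [labels[i] for i in reversed(path)]


labels = ["O", "B-EMAIL", "I-EMAIL", "E-EMAIL", "S-EMAIL"]
log_probs = [
    [-3.0, -0.2, -4.0, -4.0, -2.5],
    [-0.7, -4.0, -0.9, -3.0, -4.0],  # greedy argmax would pick O here
    [-3.0, -4.0, -4.0, -0.1, -3.0],
]
greedy = [labels[max(range(len(labels)), key=row.__getitem__)] for row in log_probs]
decoded = viterbi(log_probs, labels)
```

Here the greedy per-token choice is `B-EMAIL, O, E-EMAIL`, which is not a valid BIOES sequence; the constrained search returns `B-EMAIL, I-EMAIL, E-EMAIL` instead.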

For per-label accuracy, training recipe, and dataset details, see the base PyTorch checkpoint.

Architecture

Field	Value
Source model type	`openai_privacy_filter`
Source architecture	`OpenAIPrivacyFilterForTokenClassification`
Hidden size	640
Transformer layers	8
Attention	Grouped-Query (14 query heads / 2 KV heads, head_dim=64) with attention sinks
FFN	Sparse Mixture-of-Experts — 128 experts, top-4 routing, SwiGLU
Position encoding	YaRN-scaled RoPE (`rope_theta=150_000`, factor=32)
Context length	131,072 tokens (initial 4,096)
Tokenizer	`o200k_base` (tiktoken) — vocab 200,064
Output head	Linear(640 → 217) with bias
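A few of the numbers in the table follow from one another. The arithmetic below takes the table's values at face value and assumes the usual grouped-query layout, in which projection width is the head count times `head_dim`; the variable names are illustrative and are not keys from `config.json`.

```python
# Values copied from the architecture table above.
num_query_heads = 14
num_kv_heads = 2
head_dim = 64
num_categories = 54

# Grouped-query attention: several query heads share one key/value head.
queries_per_kv_head = num_query_heads // num_kv_heads  # 7
q_proj_width = num_query_heads * head_dim              # 896
kv_proj_width = num_kv_heads * head_dim                # 128

# BIOES tagging: B, I, E, and S tags per category, plus a single O tag.
num_labels = 4 * num_categories + 1                    # 217
```

The last line accounts for the 217 outputs of the classification head: four positional tags for each of the 54 categories and one shared "outside" tag.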

File set

File	Size	Purpose
`weights.safetensors`	~1.4 GB	Model weights in OpenMed-MLX layout
`config.json`	~19 KB	Model + MLX runtime config
`id2label.json`	~5 KB	Numeric ID → BIOES label string
`openmed-mlx.json`	~1 KB	OpenMed MLX manifest (task, family, runtime hints)
`tokenizer.json`, `tokenizer_config.json`	~28 MB	Source tokenizer files (kept for reference)

The MLX runtime uses tiktoken o200k_base directly for tokenization; the tokenizer.json is kept so consumers can inspect or re-tokenize via transformers if desired.
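Of the files listed above, `id2label.json` is the easiest to inspect by hand. The sketch below loads it from a downloaded model directory and lists the span classes it contains. It assumes the file is a flat JSON object mapping ID strings to labels spelled like `B-EMAIL` or `O`; the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_id2label(model_dir):
    """Read id2label.json and return a dict keyed by integer label ID."""
    raw = json.loads((Path(model_dir) / "id2label.json").read_text(encoding="utf-8"))
    # JSON object keys are always strings, so convert them back to ints.
    return {int(idx): label for idx, label in raw.items()}


def categories(id2label):
    """Collect span classes, assuming labels look like 'B-EMAIL' or 'O'."""
    return sorted({label.split("-", 1)[1] for label in id2label.values() if "-" in label})
```

Calling `categories(load_id2label(model_path))` on the directory returned by `snapshot_download` should list the span classes if the assumed label spelling holds.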

Label space (54 categories)

Category	Typical examples
Identity	`FIRSTNAME`, `MIDDLENAME`, `LASTNAME`, `PREFIX`, `AGE`, `GENDER`, `SEX`, `EYECOLOR`, `HEIGHT`, `USERNAME`, `OCCUPATION`, `JOBTITLE`, `JOBDEPARTMENT`, `ORGANIZATION`, `USERAGENT`
Contact	`EMAIL`, `PHONE`, `URL`
Address	`STREET`, `BUILDINGNUMBER`, `SECONDARYADDRESS`, `CITY`, `COUNTY`, `STATE`, `ZIPCODE`, `GPSCOORDINATES`, `ORDINALDIRECTION`
Dates & time	`DATE`, `DATEOFBIRTH`, `TIME`
Government IDs	`SSN`
Financial	`ACCOUNTNAME`, `BANKACCOUNT`, `IBAN`, `BIC`, `CREDITCARD`, `CREDITCARDISSUER`, `CVV`, `PIN`, `MASKEDNUMBER`, `AMOUNT`, `CURRENCY`, `CURRENCYCODE`, `CURRENCYNAME`, `CURRENCYSYMBOL`
Crypto	`BITCOINADDRESS`, `ETHEREUMADDRESS`, `LITECOINADDRESS`
Vehicle	`VIN`, `VRM`
Digital	`IPADDRESS`, `MACADDRESS`, `IMEI`
Auth	`PASSWORD`
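The category grouping in the table is useful when only some kinds of PII need handling. The sketch below keeps entities from chosen categories. The mapping is a partial, hand-copied subset of the table, the function name is hypothetical, and the entity dicts follow the `entity_group` field that the MLX pipeline returns.

```python
# Partial copy of the table above; extend with the remaining rows as needed.
CATEGORY_OF = {
    "EMAIL": "Contact", "PHONE": "Contact", "URL": "Contact",
    "SSN": "Government IDs",
    "IBAN": "Financial", "CREDITCARD": "Financial", "CVV": "Financial",
    "PASSWORD": "Auth",
}


def filter_by_category(entities, wanted):
    """Keep entities whose label falls in one of the wanted categories."""
    return [e for e in entities if CATEGORY_OF.get(e["entity_group"]) in wanted]


# Made-up detections in the pipeline's output shape.
found = [
    {"entity_group": "EMAIL", "word": "alice.smith@example.com"},
    {"entity_group": "CVV", "word": "123"},
    {"entity_group": "FIRSTNAME", "word": "Alice"},
]
sensitive = filter_by_category(found, {"Financial", "Government IDs", "Auth"})
```

Labels missing from the mapping, such as `FIRSTNAME` here, are dropped rather than guessed, so an incomplete mapping fails closed for this filter.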

Quick start

With OpenMed — recommended

OpenMed gives you a single extract_pii() / deidentify() API that auto-selects MLX on Apple Silicon and PyTorch elsewhere — same code on every host.

pip install -U "openmed[mlx]"

from openmed import extract_pii, deidentify

text = (
    "Patient Sarah Johnson (DOB 03/15/1985), phone 415-555-0123, email sarah.johnson@example.com."
)

# Extract grouped entity spans (runs on MLX here, PyTorch fallback elsewhere)
result = extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")
for ent in result.entities:
    print(f"{ent.label:30s} {ent.text!r} conf={ent.confidence:.2f}")

# De-identify
masked = deidentify(
    text,
    method="mask",
    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit",
)
fake = deidentify(
    text,
    method="replace",
    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit",
    consistent=True,
    seed=42,  # deterministic locale-aware Faker surrogates
)

When MLX isn't available (Linux, Windows, Intel Mac, or a missing mlx package), the same call automatically falls back to the PyTorch checkpoint OpenMed/privacy-filter-multilingual and emits a one-time warning. The fallback is family-aware: a request for a multilingual MLX model resolves only to the multilingual PyTorch checkpoint, never to an unrelated baseline.
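The conditions listed above can be approximated with the standard library. This sketch is not OpenMed's actual selection logic; the function names and the override parameters (added only to make the check easy to exercise) are hypothetical.

```python
import importlib.util
import platform

MLX_REPO = "OpenMed/privacy-filter-multilingual-mlx-8bit"
PYTORCH_REPO = "OpenMed/privacy-filter-multilingual"


def mlx_is_usable(system=None, machine=None, has_mlx=None):
    """True on Apple Silicon macOS when the mlx package can be found."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if has_mlx is None:
        has_mlx = importlib.util.find_spec("mlx") is not None
    return system == "Darwin" and machine == "arm64" and has_mlx


def pick_checkpoint(**overrides):
    """Choose the MLX repo when usable, else the same-family PyTorch repo."""
    return MLX_REPO if mlx_is_usable(**overrides) else PYTORCH_REPO
```

Calling `pick_checkpoint()` with no arguments inspects the current host. Note that a Python interpreter running under Rosetta reports `x86_64`, so this check would pick the PyTorch repo there.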

Direct MLX usage (lower-level)

from huggingface_hub import snapshot_download
from openmed.mlx.inference import PrivacyFilterMLXPipeline

model_path = snapshot_download("OpenMed/privacy-filter-multilingual-mlx-8bit")
pipe = PrivacyFilterMLXPipeline(model_path)

print(pipe("Email me at alice.smith@example.com after 5pm."))
# [{'entity_group': 'EMAIL',
# 'score': 0.92,
# 'word': 'alice.smith@example.com',
# 'start': 12,
# 'end': 35}]

The pipeline returns a list of dicts with entity_group, score, word, start, and end (character offsets into the input string).
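Because `start` and `end` are character offsets, the output can be used to redact the original string directly. The helper below is a minimal sketch that assumes spans do not overlap; `mask_spans` is a hypothetical name, not part of the OpenMed API.

```python
def mask_spans(text, entities):
    """Replace each detected span with a [LABEL] placeholder."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        out = out[: ent["start"]] + f"[{ent['entity_group']}]" + out[ent["end"]:]
    return out


text = "Email me at alice.smith@example.com after 5pm."
entities = [
    {"entity_group": "EMAIL", "score": 0.92,
     "word": "alice.smith@example.com", "start": 12, "end": 35},
]
masked = mask_spans(text, entities)
# masked == "Email me at [EMAIL] after 5pm."
```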

Hardware notes

Designed for Apple Silicon (M-series GPUs); CPU inference works but is slower.
Tested on macOS with mlx>=0.18. The MLX runtime in this repo is independent of mlx_lm (token classification, not causal LM).
Lower latency and a smaller memory footprint than the BF16 sibling (~1.4 GB of weights versus ~2.6 GB).

Credits & Acknowledgements

This artifact wouldn't exist without two open-source releases — sincere thanks to both teams:

OpenAI for open-sourcing the Privacy Filter (architecture, modeling code, and opf training/eval CLI). The MLX port in this repo runs that same architecture under Apple's MLX framework.
AI4Privacy for releasing the multilingual PII masking datasets used to fine-tune the source PyTorch checkpoint: pii-masking-200k, pii-masking-400k, and open-pii-masking-500k-ai4privacy.

Additional thanks to Apple for MLX and to the Hugging Face team for the model-distribution ecosystem.

License

Apache 2.0.


URL: https://huggingface.co/OpenMed/privacy-filter-multilingual-mlx-8bit