Privacy Comparator

A learned model for pairwise comparison of privacy strength between messages.


Model Details

Model Description

Privacy Comparator is a learned model that compares two messages and determines which provides stronger protection of personal or sensitive information.

Given two inputs:

A: message  
B: message

the model outputs:

A      message A is more privacy-preserving  
B      message B is more privacy-preserving  
SAME   messages offer the same level of privacy protection

The model performs relative privacy comparison and can be applied to arbitrary message pairs, regardless of how they were generated.

It does not:

  • detect PII
  • assign absolute privacy scores
  • generate redactions

Instead, it learns a preference relation over messages in terms of privacy strength.


Base Model

Finetuned from: Qwen/Qwen2.5-7B-Instruct

Implemented as a LoRA adapter.
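
Because the model is distributed as a LoRA adapter, it has to be attached to the base model before use. The following is a minimal sketch, assuming the adapter is stored in a PEFT-compatible format under the repo id peach-lab/privacy-comparator and that the transformers, peft, and accelerate packages are installed. Neither assumption is confirmed by this card, and the prompt template the model expects is not shown here.

```python
BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"      # base model named in this card
ADAPTER_ID = "peach-lab/privacy-comparator"     # assumed adapter repo id


def load_comparator(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (downloads weights)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # requires the accelerate package
    )
    # Assumption: the adapter repo contains PEFT-style adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_comparator() returns a (tokenizer, model) pair ready for generation.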


License

This adapter inherits the license constraints of the base model.


Uses

Intended Use

  • Privacy-preserving text comparison
  • Ranking anonymization strategies
  • Evaluating relative disclosure risk

For example, multiple transformation strategies can be applied to the same input, each producing a variant:

m_i = τ(x; a_i)

where:

  • x is the original message
  • a_i is a transformation strategy (e.g., redact, abstract, retain sensitive spans)
  • τ applies the chosen strategy to produce a privacy-preserving version

Example:

Original message:

Lucy lives at 139 Tremont St in Boston.

Different strategies may produce:

m₁: [NAME1] lives at [ADDRESS1] in [CITY1].
m₂: A person lives at a residential address in a major city in the U.S.
m₃: A person lives at [ADDRESS1] in Boston.
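
The variants above can be reproduced with a toy version of τ. This is only an illustrative sketch: the span annotations, placeholders, and abstractions are hard-coded by hand, the function name tau and the action names are hypothetical, and the actual transformation framework is the one described in the paper.

```python
from typing import List, Tuple

# Hand-annotated sensitive spans for the example message:
# (surface text, redaction placeholder, abstraction). Hypothetical values.
SPANS: List[Tuple[str, str, str]] = [
    ("Lucy", "[NAME1]", "A person"),
    ("139 Tremont St", "[ADDRESS1]", "a residential address"),
    ("Boston", "[CITY1]", "a major U.S. city"),
]


def tau(x: str, actions: List[str]) -> str:
    """Apply one action per annotated span: 'redact', 'abstract', or 'retain'."""
    if len(actions) != len(SPANS):
        raise ValueError("expected exactly one action per span")
    out = x
    for (surface, placeholder, abstraction), action in zip(SPANS, actions):
        if action == "redact":
            out = out.replace(surface, placeholder)
        elif action == "abstract":
            out = out.replace(surface, abstraction)
        elif action != "retain":
            raise ValueError(f"unknown action: {action!r}")
    return out


original = "Lucy lives at 139 Tremont St in Boston."
m1 = tau(original, ["redact", "redact", "redact"])
m2 = tau(original, ["abstract", "abstract", "abstract"])  # approximates m₂
m3 = tau(original, ["abstract", "redact", "retain"])
```

Here m1 and m3 match the first and third variants exactly, while m2 is a close paraphrase of the second.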

The comparator can rank such variants based on which better protects sensitive information.
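
One way to turn pairwise decisions into a ranking is to run every pair through the comparator and count wins. The sketch below assumes a compare(a, b) callable that returns "A", "B", or "SAME". The toy_compare function is a stand-in invented for this example (it counts leftover sensitive strings) and does not reflect how the model decides, and round-robin win counting is one possible aggregation, not necessarily the procedure used in the paper.

```python
from itertools import combinations
from typing import Callable, List


def rank_by_pairwise_wins(messages: List[str],
                          compare: Callable[[str, str], str]) -> List[str]:
    """Order messages from most to least privacy-preserving by win count."""
    scores = [0.0] * len(messages)
    for i, j in combinations(range(len(messages)), 2):
        verdict = compare(messages[i], messages[j])
        if verdict == "A":
            scores[i] += 1.0
        elif verdict == "B":
            scores[j] += 1.0
        elif verdict == "SAME":
            scores[i] += 0.5
            scores[j] += 0.5
        else:
            raise ValueError(f"unexpected verdict: {verdict!r}")
    # sorted() is stable, so tied messages keep their input order.
    order = sorted(range(len(messages)), key=lambda k: -scores[k])
    return [messages[k] for k in order]


SENSITIVE = ("Lucy", "139 Tremont St", "Boston")  # hypothetical, for the stub only


def toy_compare(a: str, b: str) -> str:
    """Stand-in comparator: fewer leftover sensitive strings wins."""
    leaks_a = sum(term in a for term in SENSITIVE)
    leaks_b = sum(term in b for term in SENSITIVE)
    if leaks_a == leaks_b:
        return "SAME"
    return "A" if leaks_a < leaks_b else "B"


variants = [
    "A person lives at [ADDRESS1] in Boston.",
    "[NAME1] lives at [ADDRESS1] in [CITY1].",
]
ranking = rank_by_pairwise_wins(variants, toy_compare)
```

In practice toy_compare would be replaced by a call to the comparator model. Note that n variants require n(n-1)/2 comparisons.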

For more details on the transformation framework, please refer to the associated paper.


Out-of-Scope Use

This model is not intended for:

  • PII detection
  • Safety moderation
  • Utility evaluation
  • Generating anonymized text

It performs relative comparison only.


Training Details

  • LoRA rank: 8
  • Learning rate: 1e-4
  • Epochs: 2
  • Context length: 2048
  • Global batch size: 2048

Training was performed using Fireworks AI.

Training Data

This model is fine-tuned via supervised fine-tuning (SFT) with LoRA on pairwise privacy-preference comparisons.

Training labels were generated by a teacher model (OpenAI o3) on privacy-variant pairs derived from ShareGPT90K.
As described in the paper, o3 was selected for its agreement with human ground-truth labels on high-consensus cases.

In addition, we release a human-labeled evaluation set of 150 A/B pairs.
Each pair is annotated by at least 5 qualified participants (52 unique participants in total), and each entry includes a consensus label and a consensus_ratio.

For details on data construction, model selection, and annotation procedures, please refer to the paper.


Released Dataset (Human Ground Truth)

We release a human-labeled dataset of 150 pairwise privacy-preference comparisons.

Each JSONL entry contains:

  • survey_id, conversation_id, pair_index
  • answers: anonymized participant votes (participant_1, participant_2, ...)
  • consensus, consensus_ratio
  • message_A, message_B
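
The fields above can be read with the standard library. The sketch below rests on several assumptions not stated in this card: that answers is an object mapping participant keys to "A", "B", or "SAME", that consensus_ratio is the fraction of votes matching the consensus label, and that a 0.8 threshold is a reasonable cut for "high consensus". The sample record is made up purely to show the shape.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def load_pairs(lines: Iterable[str]) -> List[dict]:
    """Parse JSONL lines (for example an open file handle) into records."""
    return [json.loads(line) for line in lines if line.strip()]


def majority_vote(answers: Dict[str, str]) -> Tuple[str, float]:
    """Return the most common vote and its share of all votes.

    Ties are broken by first occurrence, which may differ from the
    tie-breaking used to build the released consensus field.
    """
    counts = Counter(answers.values())
    label, top = counts.most_common(1)[0]
    return label, top / len(answers)


def high_consensus(records: List[dict], min_ratio: float = 0.8) -> List[dict]:
    """Keep records whose consensus_ratio meets the (assumed) threshold."""
    return [r for r in records if r["consensus_ratio"] >= min_ratio]


# Made-up record for illustration only; not taken from the released data.
sample_line = json.dumps({
    "survey_id": "s1", "conversation_id": "c1", "pair_index": 0,
    "answers": {"participant_1": "A", "participant_2": "A",
                "participant_3": "B", "participant_4": "A",
                "participant_5": "A"},
    "consensus": "A", "consensus_ratio": 0.8,
    "message_A": "...", "message_B": "...",
})
records = load_pairs([sample_line])
label, ratio = majority_vote(records[0]["answers"])
```

To read the released file, pass an open file handle to load_pairs, for example inside a `with open(path, encoding="utf-8") as f:` block, where path points to wherever the JSONL file was downloaded.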

Participant Privacy

All participant identifiers are anonymized. No Prolific IDs or direct participant identifiers are released.


Model Outputs

The model produces structured JSON decisions:

{
  "reason": "...",
  "response": "A" | "B" | "SAME"
}
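
A consumer of these decisions will usually want to validate them before use. The following sketch parses a raw generation into the two fields shown above. The function name parse_decision is hypothetical, and the tolerance for extra text around the JSON object is a defensive assumption rather than documented model behavior.

```python
import json
import re

VALID_RESPONSES = {"A", "B", "SAME"}


def parse_decision(raw: str) -> dict:
    """Extract and validate the JSON decision from a raw model output."""
    # Take everything from the first '{' to the last '}' in the text.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    decision = json.loads(match.group(0))
    response = decision.get("response")
    if response not in VALID_RESPONSES:
        raise ValueError(f"unexpected response: {response!r}")
    return {"reason": str(decision.get("reason", "")), "response": response}


example = parse_decision('{"reason": "B removes the name.", "response": "B"}')
```

Malformed JSON raises json.JSONDecodeError, which is a subclass of ValueError, so callers can catch ValueError for both failure modes.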

Resources

Paper: OpenReview
Code: Operationalize Data Minimization

For full details of the transformation framework and action search procedure, please refer to the paper.

Model repository: peach-lab/privacy-comparator