Privacy Comparator

A learned model for pairwise comparison of privacy strength between messages.

Model Details

Model Description

Privacy Comparator is a learned model that compares two messages and determines which provides stronger protection of personal or sensitive information.

Given two inputs:

A: message  
B: message

the model outputs:

A      message A is more privacy-preserving  
B      message B is more privacy-preserving  
SAME   messages offer the same level of privacy protection

The model performs relative privacy comparison and can be applied to arbitrary message pairs, regardless of how they were generated.

It does not:

detect PII
assign absolute privacy scores
generate redactions

Instead, it learns a preference relation over messages in terms of privacy strength.
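One practical consequence of treating the output as a preference relation is that swapping the two inputs should swap the verdict. The sketch below shows such an order-swap check. The `compare` callable and `toy_compare` are hypothetical stand-ins for a call to the model, not part of the released code.

```python
from typing import Callable

# Swapping the two messages should swap "A" and "B" and leave "SAME" unchanged.
FLIPPED = {"A": "B", "B": "A", "SAME": "SAME"}


def is_order_consistent(compare: Callable[[str, str], str], a: str, b: str) -> bool:
    """Return True if the verdict for (a, b) mirrors the verdict for (b, a)."""
    return compare(b, a) == FLIPPED[compare(a, b)]


def toy_compare(a: str, b: str) -> str:
    """Hypothetical stand-in for the model: more bracketed placeholders wins."""
    count_a, count_b = a.count("["), b.count("[")
    if count_a == count_b:
        return "SAME"
    return "A" if count_a > count_b else "B"


consistent = is_order_consistent(
    toy_compare, "[NAME1] lives in [CITY1].", "Lucy lives in Boston."
)
```

A comparator that always answers "A" regardless of input order would fail this check, which makes it a cheap way to spot position bias.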

Base Model

Finetuned from: Qwen/Qwen2.5-7B-Instruct

Implemented as a LoRA adapter.
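A minimal loading sketch, assuming the adapter is published in a PEFT-compatible format under the repository id shown in this card's model tree (`peach-lab/privacy-comparator`); neither assumption is confirmed here. It uses the `transformers` and `peft` libraries and does not cover the prompt format, which is not documented in this card.

```python
BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
ADAPTER_ID = "peach-lab/privacy-comparator"  # assumed repository id of the adapter


def load_comparator(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (downloads the weights)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference only
    return tokenizer, model
```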

License

This adapter inherits the license constraints of the base model.

Uses

Intended Use

Privacy-preserving text comparison
Ranking anonymization strategies
Evaluating relative disclosure risk

For example, when multiple transformation strategies are applied to the same input:

m_i = τ(x; a_i)

where:

x is the original message
a_i is a transformation strategy (e.g., redact, abstract, retain sensitive spans)
τ applies the chosen strategy to produce a privacy-preserving version

Example:

Original message:

Lucy lives at 139 Tremont St in Boston.

Different strategies may produce:

m₁: [NAME1] lives at [ADDRESS1] in [CITY1].
m₂: A person lives at a residential address in a major U.S. city.
m₃: A person lives at [ADDRESS1] in Boston.

The comparator can rank such variants based on which better protects sensitive information.
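One way to turn pairwise verdicts into a ranking is to count wins over all pairs. The sketch below does this for the three variants above. Win counting is an illustrative choice, not necessarily the procedure used in the paper, and `toy_compare` is a hypothetical stand-in for the model that simply counts leaked strings.

```python
from itertools import combinations
from typing import Callable


def rank_by_privacy(variants: dict[str, str], compare: Callable[[str, str], str]) -> list[str]:
    """Order variant names from most to least privacy-preserving by pairwise wins."""
    score = {name: 0.0 for name in variants}
    for first, second in combinations(variants, 2):
        verdict = compare(variants[first], variants[second])
        if verdict == "A":
            score[first] += 1.0
        elif verdict == "B":
            score[second] += 1.0
        else:  # "SAME": split the point between both variants
            score[first] += 0.5
            score[second] += 0.5
    # sorted() is stable, so tied variants keep their original order.
    return sorted(variants, key=lambda name: score[name], reverse=True)


# Toy stand-in for the comparator (NOT the model): a variant that leaks fewer
# of the original sensitive strings is treated as more privacy-preserving.
SENSITIVE = ("Lucy", "139 Tremont St", "Boston")


def toy_compare(a: str, b: str) -> str:
    leaks_a = sum(s in a for s in SENSITIVE)
    leaks_b = sum(s in b for s in SENSITIVE)
    if leaks_a == leaks_b:
        return "SAME"
    return "A" if leaks_a < leaks_b else "B"


variants = {
    "m1": "[NAME1] lives at [ADDRESS1] in [CITY1].",
    "m2": "A person lives at a residential address in a major U.S. city.",
    "m3": "A person lives at [ADDRESS1] in Boston.",
}
ranking = rank_by_privacy(variants, toy_compare)
```

With the toy comparator, the variant that retains "Boston" ends up last. The real model may order the variants differently, since it judges more than literal string leakage.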

For more details on the transformation framework, please refer to the associated paper.

Out-of-Scope Use

This model is not intended for:

PII detection
Safety moderation
Utility evaluation
Generating anonymized text

It performs relative comparison only.

Training Details

LoRA rank: 8
Learning rate: 1e-4
Epochs: 2
Context length: 2048
Global batch size: 2048

Training was performed using Fireworks AI.

Training Data

This model is trained via supervised fine-tuning (SFT) with LoRA on pairwise privacy-preference comparisons.

Training labels are generated by a teacher model (OpenAI o3) on privacy-variant pairs derived from ShareGPT90K.
As described in the paper, o3 was selected for its agreement with human ground truth on high-consensus cases.

In addition, we release a human-labeled evaluation set of 150 A/B pairs.
Each pair is annotated by at least 5 qualified participants (52 unique participants in total) and comes with a consensus label and a consensus_ratio.

For details on data construction, model selection, and annotation procedures, please refer to the paper.

Released Dataset (Human Ground Truth)

We release a human-labeled dataset of 150 pairwise privacy-preference comparisons.

Each JSONL entry contains:

survey_id, conversation_id, pair_index
answers: anonymized participant votes (participant_1, participant_2, ...)
consensus, consensus_ratio
message_A, message_B
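A minimal sketch of reading such entries and keeping only high-consensus pairs. It assumes that `consensus_ratio` is a number between 0 and 1 and that `answers` maps participant keys to votes; both are guesses from the field names. The 0.8 threshold and the sample records are made up for illustration and are not taken from the released file.

```python
import json
from typing import Iterable, Iterator


def high_consensus(lines: Iterable[str], min_ratio: float = 0.8) -> Iterator[dict]:
    """Yield parsed JSONL entries whose consensus_ratio is at least min_ratio."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if float(entry["consensus_ratio"]) >= min_ratio:
            yield entry


# Made-up records for illustration only.
sample_lines = [
    json.dumps({
        "survey_id": "s1", "conversation_id": "c1", "pair_index": 0,
        "answers": {"participant_1": "A", "participant_2": "A", "participant_3": "A"},
        "consensus": "A", "consensus_ratio": 1.0,
        "message_A": "...", "message_B": "...",
    }),
    json.dumps({
        "survey_id": "s1", "conversation_id": "c2", "pair_index": 1,
        "answers": {"participant_1": "A", "participant_2": "B", "participant_3": "B"},
        "consensus": "B", "consensus_ratio": 0.67,
        "message_A": "...", "message_B": "...",
    }),
]
kept = list(high_consensus(sample_lines))
```

For a file on disk, the same function can be applied to an open file handle, since iterating over a handle yields one line at a time.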

Participant Privacy

All participant identifiers are anonymized. No Prolific IDs or direct participant identifiers are released.

Model Outputs

The model produces structured JSON decisions:

{
  "reason": "...",
  "response": "A" | "B" | "SAME"
}
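Since the decision arrives as generated text, it is worth validating before use. The helper below is a sketch, not part of the released code: `parse_decision` is a hypothetical name, and it assumes the raw output is a single JSON object with the two keys shown above.

```python
import json

VALID_RESPONSES = {"A", "B", "SAME"}


def parse_decision(raw: str) -> tuple[str, str]:
    """Return (response, reason) from the model's JSON output, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("output is not a JSON object")
    response = data.get("response")
    if not isinstance(response, str) or response not in VALID_RESPONSES:
        raise ValueError(f"unexpected response value: {response!r}")
    return response, str(data.get("reason", ""))


response, reason = parse_decision('{"reason": "B keeps the city name.", "response": "A"}')
```

Raising on anything outside the three labels keeps malformed generations from being silently counted as a verdict.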

Resources

Paper: OpenReview
Code: Operationalize Data Minimization

For full details of the transformation framework and action search procedure, please refer to the paper.

Model tree for peach-lab/privacy-comparator

Base model: Qwen/Qwen2.5-7B
Finetuned: Qwen/Qwen2.5-7B-Instruct
Adapter: this model