WMSteer-A: Auditing AudioSeal in Activation Space

A representation-guided universal removal attack on AudioSeal — and an activation-forensics defense that catches it.

TL;DR AudioSeal's detector encodes "watermarked vs. clean" as a single, perfectly linearly separable concept (linear-probe AUC = 1.000 with 100 probe clips). Contrastive activation analysis + back-projection yields a universal waveform perturbation $\delta_{\rm univ}$ that flips 93% of attacked clips below AudioSeal's default decision threshold at ~18 dB SI-SDR. The attack transfers near-identically across four detectors retrained from the public weights. A small MLP on detector activations (WMShield) detects the attack at TPR@1%FPR = 0.993.
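
As a rough illustration of what "linearly separable" means here, the sketch below scores held-out activations by their projection onto a difference-of-means direction and computes the AUC of those scores. Everything in it is an assumption made for illustration: the activations are synthetic Gaussians rather than AudioSeal features, `dim`, `shift`, and the helper names are hypothetical, and the difference-of-means projection stands in for whichever linear probe the paper actually trains.

```python
import random

def mean_vec(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

rng = random.Random(0)
dim = 8                  # hypothetical activation width
shift = [2.0] * dim      # hypothetical "watermark" offset in activation space

def sample(offset):
    """One synthetic activation vector: small Gaussian noise around `offset`."""
    return [o + rng.gauss(0.0, 0.1) for o in offset]

# 100 probe clips per class to fit the direction, fresh samples to evaluate it.
h_clean_fit = [sample([0.0] * dim) for _ in range(100)]
h_wm_fit = [sample(shift) for _ in range(100)]
v_bar = [a - b for a, b in zip(mean_vec(h_wm_fit), mean_vec(h_clean_fit))]

h_clean_test = [sample([0.0] * dim) for _ in range(200)]
h_wm_test = [sample(shift) for _ in range(200)]
probe_auc = auc([dot(v_bar, h) for h in h_wm_test],
                [dot(v_bar, h) for h in h_clean_test])
print(f"projection AUC on synthetic activations: {probe_auc:.3f}")
```

With real detector activations, `h_wm_*` and `h_clean_*` would come from forward hooks on the encoder; the synthetic classes here are separated by construction, so the number printed says nothing about AudioSeal itself.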

📄 Paper: wmsteer_a.pdf (10 pages: 8 main + 1 appendix + 1 references)

Headline numbers

| Setting | TPR@5%FPR (att) | frac p<0.5 (default rule) | δ SI-SDR |
|---|---|---|---|
| $D_A$ no channel | 0.379 [0.285, 0.465] | 1.00 | 18.1 dB |
| $D_A$ + AAC@64k chan | 0.986 [0.910, 1.000] | 0.93 | |
| $D_B$ s=111 no channel | 0.07 (TPR@1%FPR) | 1.00 | |
| $D_B$ s=222 no channel | 0.07 | 1.00 | |
| $D_B$ s=444 2× LR | 0.075 | 1.00 | |

WMShield defense: TPR@1%FPR = 0.993, TPR@5%FPR = 1.000.
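
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how SI-SDR and TPR at a fixed FPR could be computed. It rests on assumptions the source does not spell out: SI-SDR is taken between the watermarked clip (reference) and the attacked clip (estimate) after mean removal, and the FPR threshold is calibrated on scores from clean clips. The function names and the toy inputs are hypothetical.

```python
import math

def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB between two equal-length signals."""
    ref_mean = sum(reference) / len(reference)
    est_mean = sum(estimate) / len(estimate)
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]
    # Project the estimate onto the reference to get the scaled target.
    alpha = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [alpha * r for r in ref]
    target_energy = sum(t * t for t in target)
    noise_energy = sum((e - t) ** 2 for e, t in zip(est, target))
    if noise_energy == 0.0:
        return math.inf
    if target_energy == 0.0:
        return -math.inf
    return 10.0 * math.log10(target_energy / noise_energy)

def tpr_at_fpr(pos_scores, neg_scores, target_fpr):
    """TPR at a threshold that at most target_fpr of neg_scores exceed."""
    ranked = sorted(neg_scores, reverse=True)
    k = int(target_fpr * len(ranked) + 1e-9)  # negatives allowed above threshold
    threshold = ranked[k] if k < len(ranked) else -math.inf
    hits = sum(1 for p in pos_scores if p > threshold)
    return hits / len(pos_scores)

# Toy inputs, not real measurements.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.9, -1.1]          # reference plus a small orthogonal error
clean_scores = [i / 100 for i in range(100)]
attacked_scores = [0.90, 0.95, 0.96, 1.00]

print(f"SI-SDR: {si_sdr(reference, estimate):.1f} dB")
print(f"TPR@5%FPR: {tpr_at_fpr(attacked_scores, clean_scores, 0.05):.2f}")
```

A lower TPR on attacked clips means the attack removed the watermark more often; a higher SI-SDR means the perturbation is quieter relative to the signal.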

Method (one figure)

[Figure: method overview]

Offline, the attacker watermarks 100 LibriSpeech probe clips with the public AudioSeal generator $G$, hooks the detector $D_A$'s encoder to extract a contrastive watermark direction $\bar v = \mathbb{E}[h_{\rm wm}] - \mathbb{E}[h_{\rm clean}]$, and back-projects it through $D_A$ via Adam to obtain a single universal waveform $\delta_{\rm univ}$ ($\|\delta\|_\infty \le \varepsilon$). At attack time, $\delta_{\rm univ}$ is added to any watermarked clip; the verifier's unmodified detector $D_V$ classifies the result as clean.
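
The two offline steps can be sketched on a toy problem. This is not the repository's implementation: a random linear map `W` stands in for the detector's encoder, a fixed random vector stands in for the generator's watermark, Adam is written out by hand so the example needs no dependencies, and all names and sizes are hypothetical. It only shows the shape of the procedure: estimate $\bar v$ from paired activations, then descend on the projection onto $\bar v$ while clamping the perturbation to the $\ell_\infty$ ball.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(W, x):
    """Toy linear 'encoder' h = W x, standing in for the detector's encoder."""
    return [dot(row, x) for row in W]

def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def back_project(W, v_bar, eps, n_steps=50, lr=1e-3, b1=0.9, b2=0.999):
    """Adam descent on <v_bar, encode(x + delta)> with an L-inf clamp on delta."""
    n = len(W[0])
    # For a linear encoder the gradient w.r.t. delta is W^T v_bar for every clip.
    grad = [sum(W[i][j] * v_bar[i] for i in range(len(W))) for j in range(n)]
    delta, m, s = [0.0] * n, [0.0] * n, [0.0] * n
    for t in range(1, n_steps + 1):
        for j in range(n):
            m[j] = b1 * m[j] + (1 - b1) * grad[j]
            s[j] = b2 * s[j] + (1 - b2) * grad[j] ** 2
            step = (m[j] / (1 - b1 ** t)) / ((s[j] / (1 - b2 ** t)) ** 0.5 + 1e-8)
            delta[j] = max(-eps, min(eps, delta[j] - lr * step))
    return delta

rng = random.Random(0)
n_samples, n_feat, eps = 16, 4, 0.01
W = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_feat)]
watermark = [rng.gauss(0, 0.05) for _ in range(n_samples)]  # stand-in for G's output
clean = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(100)]
marked = [[c + w for c, w in zip(x, watermark)] for x in clean]

# Contrastive direction: mean watermarked activation minus mean clean activation.
v_bar = [a - b for a, b in zip(mean_vec([encode(W, x) for x in marked]),
                               mean_vec([encode(W, x) for x in clean]))]
delta_univ = back_project(W, v_bar, eps)

attacked = [a + d for a, d in zip(marked[0], delta_univ)]
before = dot(v_bar, encode(W, marked[0]))
after = dot(v_bar, encode(W, attacked))
print(f"projection on v_bar: {before:.3f} -> {after:.3f}")
```

In the real setting the encoder is nonlinear, so the gradient depends on the input clip and would be obtained by backpropagation over the probe set rather than in closed form; the toy only demonstrates that the clamped perturbation lowers the projection onto $\bar v$.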

Repo contents

| Path | What |
|---|---|
| wmsteer_a.pdf | The paper (10 pages, paper/ source available on request). |
| figures/ | 12 figures used in the paper (PDF). |
| src/ | Full PyTorch implementation, ROCm-compatible. |
| scripts/run.sh | Wrapper that bakes ROCm/MIOpen environment fixes. |
| RESULTS.md | Aggregated experiment summary. |
| LITERATURE_SURVEY.md | Background literature notes. |

Reproducing

git clone https://github.com/facebookresearch/audioseal
pip install audioseal pesq pyloudnorm soundfile librosa scipy datasets matplotlib
# Download 400 LibriSpeech-test-clean clips
python scripts/fetch_libri.py --n 400
# Run the kill experiment (~1 GPU-hour on 1× MI210 / A100)
scripts/run.sh -m src.block1_kill --probe-n 100 --heldout-n 200 --rank 4 --eps 0.01 --n-steps 600
# Bootstrap CIs + multi-FPR
scripts/run.sh -m src.post_analysis --n-bootstrap 1000
# Cross-detector transfer
scripts/run.sh -m src.block7_transfer
# Baselines (PGD, UAP, FFF, controls)
scripts/run.sh -m src.block5_baselines
# Defense (WMShield)
scripts/run.sh -m src.block6_defense
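
The bracketed intervals in the headline numbers come from the bootstrap step above. A minimal sketch of a percentile bootstrap follows, assuming a 95% level and resampling over held-out clips; the source only gives the `--n-bootstrap 1000` flag, so the interval type, the function name `bootstrap_ci`, and the per-clip outcomes below are illustrative assumptions, not the contents of `src.post_analysis`.

```python
import random

def bootstrap_ci(values, stat, n_bootstrap=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(values), resampling with replacement."""
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_bootstrap)
    )
    lo_idx = int((alpha / 2) * n_bootstrap)
    hi_idx = min(n_bootstrap - 1, int((1 - alpha / 2) * n_bootstrap))
    return stats[lo_idx], stats[hi_idx]

def mean(xs):
    return sum(xs) / len(xs)

# Made-up per-clip outcomes (1 = still detected after the attack), not real results.
detected = [1] * 60 + [0] * 140
point = mean(detected)
lo, hi = bootstrap_ci(detected, mean, n_bootstrap=1000)
print(f"rate = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

For a thresholded metric such as TPR at a fixed FPR, `stat` would recompute the threshold inside each resample rather than reuse the full-sample one.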

Key design choices and ROCm notes

  • Strip the weight_norm parametrization on CPU before moving AudioSeal to the GPU. MIOpen 6.3 on gfx90a crashes on weight-norm-wrapped 1D convolutions.
  • Disable TorchInductor (TORCHDYNAMO_DISABLE=1) on ROCm: convolutions go through a path that the MIOpen kernel cache cannot resolve.
  • MIOpen RNN backward requires model.train() even with frozen weights: AudioSeal's encoder uses LSTMs; toggle via context manager.
  • Redirect MIOpen kernel cache via MIOPEN_USER_DB_PATH (system path is read-only); symlink gfx90a*.{tn,ktn}.model into the user cache.
  • All baked into scripts/run.sh.
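
Two of the notes above can be sketched in Python without importing PyTorch. This is a hedged illustration, not the contents of `scripts/run.sh`: `rocm_env` and `train_mode` are hypothetical helper names, the cache path is a placeholder, and the context manager is duck-typed against the `training` attribute and `train(mode)` method that `torch.nn.Module` provides.

```python
from contextlib import contextmanager

def rocm_env(user_db_path):
    """Environment variables named in the notes above; the path is a placeholder."""
    return {
        "TORCHDYNAMO_DISABLE": "1",
        "MIOPEN_USER_DB_PATH": str(user_db_path),
    }

@contextmanager
def train_mode(model):
    """Temporarily put `model` in train mode, then restore its previous mode."""
    was_training = model.training
    model.train(True)
    try:
        yield model
    finally:
        model.train(was_training)

# Usage sketch (set the variables before importing torch so they take effect):
#   import os; os.environ.update(rocm_env("/path/to/writable/miopen-cache"))
#   with train_mode(detector):
#       loss.backward()   # the backward pass through the LSTMs runs here
```

Train mode also changes the behaviour of layers such as dropout and batch norm where a model has them, so it is worth keeping the block as narrow as the backward pass requires. Freezing weights is a separate step, for example by calling `requires_grad_(False)` on the parameters.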

Limitations / scope

  • White-box on the public AudioSeal generator $G$ only; truly private generators are out of scope.
  • AudioSeal 0.2 16-bit, LibriSpeech test-clean only. Cross-corpus and multilingual extensions are open.
  • We do not attempt forgery (universal insertion of a specific message); the per-bit message head is multi-axis and structurally distinct.
  • WMShield is a non-adaptive defender; the attacker–shield game is unmeasured.

Ethics and disclosure

This is a security audit of a deployed watermarking system using only the publicly released weights and publicly available speech. We do not target any specific deployed service. We disclose findings to the AudioSeal authors prior to publication and propose WMShield as a concrete mitigation.

Citation

@article{wmsteer2026,
  title={Auditing AudioSeal in Activation Space:
         A Linear Watermark Direction Yields a Universal,
         Cross-Detector Removal Perturbation, and Its Defense},
  author={Anonymous},
  year={2026},
  note={Submitted to INTERSPEECH 2027}
}