CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift
Abstract
A calibrated screening system using multiple sequence similarity signals and conformal risk control achieves guaranteed low false flag rates for DNA synthesis hazard detection across taxonomic families.
DNA-synthesis providers screen incoming orders by searching the requested sequence against curated hazard lists. We show that this baseline collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the reference set: under Conformal Risk Control's certified miss-rate constraint, a low-discrimination signal forces the threshold below the entire test-benign mass. We compose three signals derived from a synthesis order's public annotation: k-mer Jaccard similarity to known toxins, the trimmed-mean score of a five-LLM judge panel, and cosine similarity to clustered embedding centroids. Fused under a monotone logistic aggregator and calibrated by Conformal Risk Control, the resulting screener certifies E[FNR] le α. Across ten leave-one-taxonomic-family-out folds at α=0.05 on UniProt KW-0800 reviewed toxins, the calibrated screener achieves 0% test miss rate on every fold and 0% test false-flag rate on nine of ten folds. The bound's finite-sample slack 1/(n_{cal}+1) caps the certifiable miss rate at 1.77% on our 200-hazard subsample; reaching procurement-grade α=10^{-3} requires an 18times larger calibration set, which the full reviewed UniProt KW-0800 corpus is large enough to deliver. The binding constraint on certifiable DNA-synthesis screening is calibration data, not algorithms. Code: https://github.com/najmulhasan-code/crc-screen
Get this paper in your agent:
hf papers read 2605.00074 Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper