Sentinel-D spaCy NER Model (Stage 1 - NVD Parsing)
Model Details
- Base Model: spaCy blank English (en_core_web_blank)
- Task: Named Entity Recognition (NER)
- Training Date: 2026-03-04T21:49:41.890810
- Framework: spaCy 3.x
- Training Data Size: 550 descriptions, plus a 50-example held-out test set
- Training Epochs: 20
- Dropout: 0.35
Custom NER Labels
- VERSION_RANGE: Semantic version strings or version constraints (e.g., "1.2.3", "< 2.0.0")
- API_SYMBOL: Method, class, or function names (e.g., "queryset.filter()", "X.509")
- BREAKING_CHANGE: References to incompatible API changes or deprecations
- FIX_ACTION: Specific remediation steps or upgrade instructions
Evaluation Metrics
| Metric | Value |
|---|---|
| Precision | 0.9111 |
| Recall | 0.7885 |
| F1 Score | 0.8454 |
| True Positives | 41 |
| False Positives | 4 |
| False Negatives | 11 |
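The precision, recall, and F1 values in the table follow directly from the raw counts. A minimal sketch of that arithmetic is shown below; the helper name `prf` is hypothetical and not part of the released model.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw entity-level counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the table above.
precision, recall, f1 = prf(tp=41, fp=4, fn=11)
print(f"{precision:.4f} {recall:.4f} {f1:.4f}")
# 0.9111 0.7885 0.8454
```

The gap between precision and recall means the model misses entities (11 false negatives) more often than it invents them (4 false positives).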
Usage
import spacy
nlp = spacy.load("./spacy-nvd-ner-v1")
text = "OpenSSL versions before 1.1.1n contain a buffer overflow in the X.509 verifier."
doc = nlp(text)
for ent in doc.ents:
    print(f"{ent.text} -> {ent.label_}")
# Output:
# 1.1.1n -> VERSION_RANGE
# X.509 -> API_SYMBOL
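Downstream code often needs the entities bucketed by label instead of printed one by one. The sketch below shows one possible way to do that on plain `(text, label)` pairs, so it runs without the model installed; `group_by_label` is a hypothetical helper, not something shipped with this model.

```python
from collections import defaultdict

# The four custom labels listed in this card.
LABELS = ("VERSION_RANGE", "API_SYMBOL", "BREAKING_CHANGE", "FIX_ACTION")


def group_by_label(entities):
    """Group (text, label) pairs into {label: [text, ...]}, preserving order."""
    grouped = defaultdict(list)
    for text, label in entities:
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label}")
        grouped[label].append(text)
    return dict(grouped)


# With a loaded pipeline, the pairs would come from:
#   [(ent.text, ent.label_) for ent in nlp(text).ents]
pairs = [("1.1.1n", "VERSION_RANGE"), ("X.509", "API_SYMBOL")]
print(group_by_label(pairs))
# {'VERSION_RANGE': ['1.1.1n'], 'API_SYMBOL': ['X.509']}
```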
Installation
- Extract the zip archive to your project directory
- Load the model using spaCy:
    import spacy
    nlp = spacy.load("./spacy-nvd-ner-v1")
Architecture
The model consists of:
- Input Layer: Vectorized token representations
- Hidden Layer: Feed-forward network with 0.35 dropout
- Output Layer: 4-class NER tagger (softmax)
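To illustrate what a softmax output over the four labels means, here is a toy sketch in plain Python. The scores are made up for illustration and the code does not reproduce the model's actual weights or internals; it only shows how raw scores become a probability distribution over the labels.

```python
import math

LABELS = ["VERSION_RANGE", "API_SYMBOL", "BREAKING_CHANGE", "FIX_ACTION"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for a single token, one per label.
scores = [2.0, 0.5, -1.0, 0.0]
probs = softmax(scores)
best = LABELS[probs.index(max(probs))]
print(best)
# VERSION_RANGE
```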
Training Configuration
- Optimizer: SGD
- Batch Size Range: 8-32 (compounding)
- Training Data: Real NVD descriptions auto-annotated with GLiNER teacher model
- Constraint: The held-out test set contains exactly 50 examples (Master Document requirement)
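A compounding batch size starts small and grows geometrically until it reaches a cap. The sketch below illustrates the idea for the 8-32 range in plain Python. The growth factor of 1.5 is an assumption chosen to make the growth visible; the factor actually used in training is not stated in this card.

```python
from itertools import islice


def compounding(start: float, stop: float, compound: float):
    """Yield start, start*compound, start*compound**2, ..., capped at stop."""
    value = start
    while True:
        yield min(value, stop)
        value *= compound


# Assumed growth factor of 1.5; the real training value is unknown.
sizes = [int(s) for s in islice(compounding(8.0, 32.0, 1.5), 6)]
print(sizes)
# [8, 12, 18, 27, 32, 32]
```

Small early batches give noisier but more frequent updates, while larger later batches stabilize training once the weights have moved away from their initial values.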
Known Limitations
- Model trained on NVD descriptions only; may not generalize to other security domains
- Entity boundaries may not align perfectly with whitespace
- Requires English text input
License
MIT