Sentinel-D spaCy NER Model (Stage 1: NVD Parsing)

Model Details

  • Base Model: spaCy blank English pipeline (spacy.blank("en"))
  • Task: Named Entity Recognition (NER)
  • Training Date: 2026-03-04T21:49:41.890810
  • Framework: spaCy 3.x
  • Training Data Size: 550 descriptions, plus a 50-example held-out test set
  • Training Epochs: 20
  • Dropout: 0.35

Custom NER Labels

  1. VERSION_RANGE: Semantic version strings or version constraints (e.g., "1.2.3", "< 2.0.0")
  2. API_SYMBOL: Method, class, or function names (e.g., "queryset.filter()", "X.509")
  3. BREAKING_CHANGE: References to incompatible API changes or deprecations
  4. FIX_ACTION: Specific remediation steps or upgrade instructions
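
To make the label set concrete, here is a minimal sketch of how one annotated example could be represented using character offsets, a convention commonly used with spaCy training data. The card does not state the actual annotation format used for this model, so the `span` helper and the tuple layout are assumptions for illustration only.

```python
# The four custom labels listed above.
LABELS = ("VERSION_RANGE", "API_SYMBOL", "BREAKING_CHANGE", "FIX_ACTION")

def span(text: str, phrase: str, label: str) -> tuple[int, int, str]:
    """Hypothetical helper: locate `phrase` in `text` and return (start, end, label)."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return start, start + len(phrase), label

text = "OpenSSL versions before 1.1.1n contain a buffer overflow in the X.509 verifier."

# One example: the raw text plus its labelled character spans.
example = (
    text,
    {
        "entities": [
            span(text, "1.1.1n", "VERSION_RANGE"),
            span(text, "X.509", "API_SYMBOL"),
        ]
    },
)

for start, end, label in example[1]["entities"]:
    print(f"{text[start:end]!r} [{start}:{end}] -> {label}")
```

The end offset is exclusive, so `text[start:end]` returns exactly the annotated phrase.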

Evaluation Metrics

Metric            Value
Precision         0.9111
Recall            0.7885
F1 Score          0.8454
True Positives    41
False Positives   4
False Negatives   11
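
The precision, recall, and F1 values follow directly from the true positive, false positive, and false negative counts. The sketch below recomputes them as a sanity check; the function name `prf` is illustrative and not part of the model package.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw entity counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

precision, recall, f1 = prf(tp=41, fp=4, fn=11)
print(f"Precision {precision:.4f}")  # 0.9111
print(f"Recall    {recall:.4f}")     # 0.7885
print(f"F1 Score  {f1:.4f}")         # 0.8454
```

Recall is noticeably lower than precision: with 11 false negatives against 4 false positives, the model misses entities more often than it invents them.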

Usage

import spacy

# Load the extracted model directory
nlp = spacy.load("./spacy-nvd-ner-v1")

text = "OpenSSL versions before 1.1.1n contain a buffer overflow in the X.509 verifier."
doc = nlp(text)

# Print each recognized entity with its label
for ent in doc.ents:
    print(f"{ent.text} -> {ent.label_}")

# Expected output:
# 1.1.1n -> VERSION_RANGE
# X.509 -> API_SYMBOL

Installation

  1. Install spaCy 3.x (for example, pip install "spacy>=3,<4")
  2. Extract the zip archive to your project directory
  3. Load the model using spaCy:
    import spacy
    nlp = spacy.load("./spacy-nvd-ner-v1")

Architecture

The model consists of:

  • Input Layer: Vectorized token representations
  • Hidden Layer: Feed-forward network with 0.35 dropout
  • Output Layer: 4-class NER tagger (softmax)

Training Configuration

  • Optimizer: SGD
  • Batch Size Range: 8-32 (compounding)
  • Training Data: Real NVD descriptions auto-annotated with a GLiNER teacher model
  • Constraint: The held-out test set contains exactly 50 examples (Master Document requirement)
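
The "8-32 (compounding)" batch size range means the batch size starts at 8 and is multiplied by a constant factor after every batch until it reaches the cap of 32. The sketch below reimplements that schedule in plain Python to show its shape. The compound factor of 1.5 is an assumption chosen so the growth is visible over a few batches (the card does not state the actual rate), and the placeholder strings stand in for the 550 training descriptions.

```python
from itertools import islice
from typing import Iterable, Iterator

def compounding(start: float, stop: float, compound: float) -> Iterator[float]:
    """Yield start, start*compound, start*compound**2, ... capped at stop."""
    value = start
    while True:
        yield min(value, stop)
        value *= compound

def minibatches(items: Iterable, sizes: Iterator[float]) -> Iterator[list]:
    """Split `items` into consecutive batches whose sizes follow `sizes`."""
    it = iter(items)
    for size in sizes:
        batch = list(islice(it, int(size)))
        if not batch:
            return
        yield batch

# Placeholder data standing in for the 550 training descriptions.
train = [f"description {i}" for i in range(550)]

# Assumed compound factor of 1.5 (not stated in the card).
batch_sizes = [len(b) for b in minibatches(train, compounding(8, 32, 1.5))]
print(batch_sizes[:6])   # [8, 12, 18, 27, 32, 32]
print(len(batch_sizes))  # 20 batches in one pass over the data
```

Small early batches give more frequent updates at the start of an epoch, and the larger later batches give smoother gradient estimates.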

Known Limitations

  • Model trained on NVD descriptions only; may not generalize to other security domains
  • Entity boundaries may not align perfectly with whitespace
  • Requires English text input
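
If downstream code needs whitespace-aligned entities, one possible post-processing step is to expand each predicted character span outward to the nearest whitespace. This is a minimal sketch under that assumption; `snap_to_whitespace` is a hypothetical helper, not part of the model or of spaCy.

```python
def snap_to_whitespace(text: str, start: int, end: int) -> tuple[int, int]:
    """Expand [start, end) outward until each edge touches whitespace or the string boundary."""
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return start, end

text = "The fix changes queryset.filter() to reject raw SQL."

# Suppose the model predicted only the fragment "filter".
start = text.index("filter")
end = start + len("filter")

s, e = snap_to_whitespace(text, start, end)
print(text[s:e])  # queryset.filter()
```

Expanding to whitespace also pulls in adjacent punctuation (for example a trailing period), so a real pipeline would likely strip that afterwards.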

License

MIT
