task-classifier-mini-v3

task-classifier-mini-v3 is a lightweight binary sequence classification model that identifies texts containing task statements (i.e., activities to be performed in a work role). Built on top of prajjwal1/bert-tiny, it is optimized for high-speed, high-throughput filtering pipelines.

This particular version is an improved iteration of task-classifier-mini-improved2, fine-tuned on more curated examples from a large job postings corpus. We include validation results below.

Basic Usage

You can easily use this model with the standard Hugging Face text-classification pipeline.

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

model_name = "loyoladatamining/task-classifier-mini-v3"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)  # truncation is configured on the pipeline below

# Create text classification pipeline
nlp = pipeline(
    "text-classification", 
    model=model, 
    tokenizer=tokenizer,
    max_length=64,
    truncation=True
)

# Inference
text = "Manage and maintain the internal database servers on a weekly basis."
result = nlp(text)
print(result)

Output Format

The pipeline returns a list containing a single classification result for the input text, with the predicted binary label and its associated confidence score:

[
  {
    "label": "LABEL_1",
    "score": 0.9845
  }
]

Label Mapping

  • LABEL_0: The text is not a valid task statement.
  • LABEL_1: The text is a task statement.
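
In a filtering pipeline, the label mapping above can be turned into a keep/drop decision. The following is a minimal sketch, assuming results have the shape shown under Output Format. The helper names `is_task_statement` and `keep_task_statements` and the 0.5 threshold are illustrative choices, not part of the model or the transformers API, and the example results are hand-written rather than real model output.

```python
from typing import Dict, List, Sequence, Union

Result = Dict[str, Union[str, float]]


def is_task_statement(result: Result, threshold: float = 0.5) -> bool:
    """Return True if the result is LABEL_1 with a score at or above threshold.

    `threshold` is a hypothetical tuning knob, not something the model defines.
    """
    return result["label"] == "LABEL_1" and float(result["score"]) >= threshold


def keep_task_statements(
    texts: Sequence[str], results: Sequence[Result], threshold: float = 0.5
) -> List[str]:
    """Keep only the texts whose paired result passes is_task_statement."""
    if len(texts) != len(results):
        raise ValueError("texts and results must have the same length")
    return [t for t, r in zip(texts, results) if is_task_statement(r, threshold)]


# Hand-written results in the documented output shape (not real model output).
example_texts = [
    "Manage and maintain the internal database servers on a weekly basis.",
    "We are an equal opportunity employer.",
]
example_results = [
    {"label": "LABEL_1", "score": 0.98},
    {"label": "LABEL_0", "score": 0.95},
]
print(keep_task_statements(example_texts, example_results))
```

With the pipeline from Basic Usage, passing a list of strings (`nlp(texts)`) returns one result per text, so the returned list can be handed to `keep_task_statements` directly.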

Evaluation

The performance of task-classifier-mini-v3 was evaluated against the previous iteration (task-classifier-mini-improved2) using the loyoladatamining/usajobs_validation dataset.

On the task portion of the validation set, this model improves accuracy by about 12 points and F1 by about 13 points over the previous iteration:

Model                            Accuracy   F1
task-classifier-mini-improved2   0.8358     0.8253
task-classifier-mini-v3          0.9583     0.9585
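
For reference, accuracy and F1 for a binary classifier can be computed as in the sketch below. This is an illustration only: the model card does not state how F1 was averaged, so the function assumes binary F1 on the positive class (label 1). The function name `accuracy_and_f1` is hypothetical, and the labels in the example are made up rather than drawn from loyoladatamining/usajobs_validation.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and positive-class F1 for binary labels (0 or 1)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return accuracy, f1


# Toy labels for illustration only (1 = task statement, 0 = not a task statement).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.4f} f1={f1:.4f}")
```

If the reported scores used macro or weighted averaging instead, the F1 values would need to be computed per class and combined accordingly.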

Citation

If you find this model useful in your work, please consider citing:

@article{meisenbacher2025extracting,
  title={Extracting O*NET Features from the NLx Corpus to Build Public Use Aggregate Labor Market Data},
  author={Meisenbacher, Stephen and Nestorov, Svetlozar and Norlander, Peter},
  year={2025}
}