is_pay
is_pay is a lightweight sequence classification model that predicts whether a given text string contains wage or salary information. It was fine-tuned from lyeonii/bert-tiny, a compact BERT variant, which keeps inference cheap enough for high-throughput filtering pipelines.
Basic Usage
You can deploy this model using the standard Hugging Face text-classification pipeline.
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
model_name = "loyoladatamining/is_pay"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name, max_length=64, truncation=True)
# Create text classification pipeline
nlp = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    max_length=64,
    truncation=True,
)
# Inference
text = "The starting salary for this position is $75,000 per year."
result = nlp(text)
print(result)
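For the high-throughput filtering use case mentioned in the introduction, the same pipeline can be called on a list of strings, in which case it returns one prediction dictionary per input. The sketch below is one possible way to keep only the texts labeled as containing pay information. The helper name `filter_pay_texts` and the `fake_classify` stand-in are assumptions made for illustration and are not part of the model or the library; the stand-in exists only so the snippet runs without downloading the model.

```python
from typing import Callable, Dict, Iterable, List


def filter_pay_texts(
    texts: Iterable[str],
    classify: Callable[[List[str]], List[Dict]],
    pay_label: str = "LABEL_1",
) -> List[str]:
    """Keep only the texts whose predicted label equals pay_label.

    `classify` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, one per input, in the same order.
    """
    texts = list(texts)
    if not texts:
        return []
    predictions = classify(texts)
    return [text for text, pred in zip(texts, predictions) if pred["label"] == pay_label]


def fake_classify(batch: List[str]) -> List[Dict]:
    # Hypothetical stand-in for the real pipeline; it is NOT the model.
    # It only mimics the output shape so the example runs offline.
    return [
        {"label": "LABEL_1" if "$" in text else "LABEL_0", "score": 1.0}
        for text in batch
    ]


texts = [
    "The starting salary for this position is $75,000 per year.",
    "Must be able to lift 50 pounds.",
]
kept = filter_pay_texts(texts, fake_classify)
print(kept)
```

With the real model, the `nlp` pipeline object created above would be passed in place of `fake_classify`, for example `filter_pay_texts(texts, nlp)`.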
Output Format
The model returns a list containing a dictionary with the predicted binary class label and its corresponding confidence score:
[
    {
        "label": "LABEL_1",
        "score": 0.9942
    }
]
Label Mapping
LABEL_0: The text does not contain wage or salary information.
LABEL_1: The text contains wage or salary information.
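In downstream code it is often convenient to turn the label and score into a single boolean. The following is a minimal sketch, assuming the output format and label mapping described above; the function name `has_pay_info` and the default threshold of 0.5 are illustrative choices, not values recommended by the model authors.

```python
from typing import Dict, List

# Label that marks wage or salary information, per the label mapping above.
PAY_LABEL = "LABEL_1"


def has_pay_info(result: List[Dict], threshold: float = 0.5) -> bool:
    """Return True if the first prediction is LABEL_1 with score >= threshold."""
    top = result[0]
    return top["label"] == PAY_LABEL and top["score"] >= threshold


# Example input copied from the documented output format.
example = [{"label": "LABEL_1", "score": 0.9942}]
print(has_pay_info(example))
print(has_pay_info(example, threshold=0.999))
```

Raising the threshold trades recall for precision: fewer texts are flagged, but those that are flagged carry a higher confidence score.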
Citation
If you find is_pay useful in your work, please consider citing:
@article{meisenbacher2025extracting,
  title={Extracting O*NET Features from the NLx Corpus to Build Public Use Aggregate Labor Market Data},
  author={Meisenbacher, Stephen and Nestorov, Svetlozar and Norlander, Peter},
  year={2025}
}