How to use tanaos/tanaos-topic-classification-v1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="tanaos/tanaos-topic-classification-v1")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("tanaos/tanaos-topic-classification-v1")
model = AutoModelForSequenceClassification.from_pretrained("tanaos/tanaos-topic-classification-v1")
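Loading the model directly yields raw logits rather than labelled scores. As a minimal sketch of the post-processing step, assuming single-label classification with a softmax over the logits, the following pure-Python snippet turns a logit vector into a label and score. The logit values and the three-entry label mapping are made up for illustration; in practice the logits would come from `model(**inputs).logits` and the mapping from `model.config.id2label`.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_label(logits, id2label):
    """Return the highest-probability label in pipeline-style format."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}

# Hypothetical values for illustration only; the real label order
# is defined by the model's configuration.
example_id2label = {0: "politics", 1: "health", 2: "technology"}
example_logits = [0.2, -1.3, 4.1]
prediction = top_label(example_logits, example_id2label)
print(prediction["label"])
```

The `pipeline` helper performs an equivalent step internally, so this is only needed when working with the model and tokenizer directly.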
This model was created by Tanaos with the Artifex Python library.
This is a topic classification model based on FacebookAI/roberta-base and fine-tuned on a synthetic dataset to classify text into one of 15 topic categories:
| Topic | Description |
|---|---|
| politics | elections, policies, scandals, ideology. |
| health | physical health, mental health, fitness, diets, medical advice. |
| technology | gadgets, software, AI, cybersecurity. |
| entertainment | movies, TV shows, music, celebrities, streaming platforms. |
| money_finance | investing, budgeting, crypto, real estate. |
| relationships_dating | romance, breakups, marriage, family drama. |
| education_learning | schools, universities, self-study, online courses. |
| work_careers | job hunting, workplace culture, remote work, career advice. |
| science | research, space, climate, biology, physics, chemistry and the scientific method. |
| society_culture | identity, inequality, norms, language, and society. |
| gaming | video games, esports, hardware, mods, and gaming culture. |
| lifestyle_hobbies | travel, food, fashion, DIY, productivity systems. |
| sports | teams, athletes, events, scores, and sports culture. |
| automotive | cars, motorcycles, reviews, maintenance, and industry news. |
| other | miscellaneous topics not covered by the other categories. |
Use this model through the Artifex library:
Install Artifex with:
pip install artifex
Use the model with:
from artifex import Artifex
topic_classification = Artifex().topic_classification()
topic = topic_classification("What do you think about the latest AI advancements?")
print(topic)
# >>> [{'label': 'technology', 'score': 0.9910}]
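The output shown above is a list of dictionaries with `label` and `score` keys. A minimal sketch of how downstream code might consume it, routing low-confidence predictions to the `other` category, follows. The helper name `resolve_topic` and the `min_score` threshold of 0.5 are illustrative assumptions, not part of Artifex.

```python
def resolve_topic(predictions, min_score=0.5, fallback="other"):
    """Pick the best label, or the fallback when confidence is too low."""
    if not predictions:
        return fallback
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else fallback

# The first sample mirrors the output format shown above;
# the second is a hypothetical low-confidence prediction.
sample = [{"label": "technology", "score": 0.9910}]
print(resolve_topic(sample))
print(resolve_topic([{"label": "science", "score": 0.31}]))
```

A suitable threshold depends on the application and is best chosen by checking scores on a held-out set of real inputs.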
This model was trained using the Artifex Python library (`pip install artifex`) by providing the following instructions and generating 10,000 synthetic training samples:
from artifex import Artifex
topic_classification = Artifex().topic_classification()
topic_classification.train(
    domain="general",
    classes={
        "politics": "elections, policies, scandals, ideology",
        "health": "physical health, mental health, fitness, diets, medical advice.",
        "technology": "gadgets, software, AI, cybersecurity.",
        "entertainment": "movies, TV shows, music, celebrities, streaming platforms.",
        "money_finance": "investing, budgeting, crypto, real estate.",
        "relationships_dating": "romance, breakups, marriage, family drama.",
        "education_learning": "schools, universities, self-study, online courses.",
        "work_careers": "job hunting, workplace culture, remote work, career advice.",
        "science": "research, space, climate, biology, physics, chemistry and the scientific method.",
        "society_culture": "identity, inequality, norms, language, and society.",
        "gaming": "video games, esports, hardware, mods, and gaming culture.",
        "lifestyle_hobbies": "travel, food, fashion, DIY, productivity systems.",
        "sports": "teams, athletes, events, scores, and sports culture.",
        "automotive": "cars, motorcycles, reviews, maintenance, and industry news.",
        "other": "miscellaneous topics not covered by the other categories."
    },
    num_samples=10000
)
This model is intended to:
Not intended for:
Base model: FacebookAI/roberta-base