Description

afroscope-model is a language identification (LID) model from the AfroScope project. It was built by fine-tuning Serengeti and supports 713 African languages.

For more details on the supported languages and performance, as well as significant changes from previous versions, please refer to LINK_HERE.

Dataset: dataset
Repository: github
Paper: arXiv:2601.13346

How to use

Here is how to use this model to detect the language of a given text:

from transformers import pipeline

# Load the AfroScope LID model as a text-classification pipeline
afroscope_model = pipeline("text-classification", model="UBC-NLP/afroscope-model")

input_text = "Ninyepuní íne εtɩε, bε ewǐe Jesi ɔnʋ lεfε kʋkʋkpɔ cε."

result = afroscope_model(input_text)

# Extract the label and score from the first result
language = result[0]["label"]
score = result[0]["score"]

print(f"detected language: {language}\tscore: {round(score * 100, 2)}")
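
The single-text example above extends naturally to batches. The following is a minimal sketch, not part of the official AfroScope code: the helper names `load_afroscope` and `identify_batch` and the `MIN_SCORE` cutoff are hypothetical, chosen for illustration. It assumes the pipeline returns one `{"label", "score"}` dict per input when called with a list of strings, which is the default behavior of the `transformers` text-classification pipeline.

```python
from typing import Callable, List, Optional

# Hypothetical confidence cutoff for this sketch; tune it on held-out data.
MIN_SCORE = 0.5


def load_afroscope():
    """Build the LID pipeline (downloads the model on first use)."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model="UBC-NLP/afroscope-model")


def identify_batch(
    classifier: Callable,
    texts: List[str],
    min_score: float = MIN_SCORE,
) -> List[Optional[str]]:
    """Return one label per text, or None when the top score is below min_score."""
    # Calling the pipeline with a list yields one prediction dict per input text.
    predictions = classifier(texts)
    return [p["label"] if p["score"] >= min_score else None for p in predictions]


# Usage (requires network access to fetch the model):
#   classifier = load_afroscope()
#   labels = identify_batch(classifier, ["first sentence", "second sentence"])
```

Returning `None` for low-confidence predictions lets downstream code treat those inputs as undetermined instead of trusting a weak guess. Whether 0.5 is a sensible cutoff for this model is an assumption that should be checked against the reported performance.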

Citation

@article{kwon2026afroscope,
  title={AfroScope: A Framework for Studying the Linguistic Landscape of Africa},
  author={Kwon, Sang Yun and Elmadany, AbdelRahim and Abdul-Mageed, Muhammad},
  journal={arXiv preprint arXiv:2601.13346},
  year={2026}
}

Model size: 0.3B params
Tensor type: F32 (Safetensors)
