Arabic Named Entity Recognition with LoRA Fine-tuning

A fine-tuned BERT model for Arabic Named Entity Recognition (NER) using Low-Rank Adaptation (LoRA) on the AraBERT v2 base model.

Model Details

Model Description

This model is a fine-tuned version of aubmindlab/bert-base-arabertv2 for token classification tasks, specifically Arabic Named Entity Recognition. The model uses LoRA (Low-Rank Adaptation) for efficient fine-tuning, making it parameter-efficient while maintaining high performance.
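
For readers unfamiliar with LoRA, the idea (this is the standard LoRA formulation, not something specific to this particular training run) is to freeze each targeted pretrained weight matrix W0 and learn a low-rank update, so an adapted layer computes:

h = W0·x + (α / r)·B·A·x,  where B is d×r, A is r×k, and the rank r is much smaller than d and k

With the configuration reported under Training Hyperparameters (r = 32, α = 64), the update B·A is scaled by α/r = 2, and only A, B, and the token-classification head are trained while the base AraBERT weights stay frozen.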

The model can identify four types of entities in Arabic text:

  • PER (Person): Names of people
  • ORG (Organization): Companies, institutions, government bodies
  • LOC (Location): Cities, countries, geographical locations
  • MISC (Miscellaneous): Other named entities

  • Developed by: Diaa Essam
  • Model type: Token Classification (NER)
  • Language(s) (NLP): Arabic (ar)
  • License: MIT
  • Finetuned from model: aubmindlab/bert-base-arabertv2


Uses

Direct Use

The model can be used directly for Arabic Named Entity Recognition without additional fine-tuning; a minimal pipeline-based inference sketch follows the list below. It is suitable for:

  • Extracting named entities from Arabic news articles
  • Information extraction from Arabic documents
  • Arabic text analysis and understanding
  • Building Arabic NLP pipelines
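
The sketch below shows direct use via the 🤗 pipeline API. It assumes the adapter loads from the repo id of this model page (Diaa-Essam/arabert-v2-ner-lora) and reuses the label mapping documented under "How to Get Started with the Model" below; aggregation_strategy="simple" groups subwords and multi-word entities into spans. A lower-level example without the pipeline API appears later in this card.

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
from peft import PeftModel

# Label mapping as documented in "How to Get Started with the Model"
id2label = {0: "B-LOC", 1: "B-MISC", 2: "B-ORG", 3: "B-PER",
            4: "I-LOC", 5: "I-MISC", 6: "I-ORG", 7: "I-PER", 8: "O"}

tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv2")
base_model = AutoModelForTokenClassification.from_pretrained(
    "aubmindlab/bert-base-arabertv2",
    num_labels=9,
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
)

# Merge the LoRA adapter into the base weights for plain inference
model = PeftModel.from_pretrained(base_model, "Diaa-Essam/arabert-v2-ner-lora")
model = model.merge_and_unload()

ner = pipeline("token-classification", model=model, tokenizer=tokenizer,
               aggregation_strategy="simple")
print(ner("محمد يعمل في شركة جوجل في القاهرة"))  # "Mohamed works at Google in Cairo"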

Downstream Use

The model can be further fine-tuned for:

  • Domain-specific NER tasks (medical, legal, financial Arabic text)
  • Custom entity types beyond the standard four categories
  • Transfer learning for related Arabic NLP tasks
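
A sketch of the continued fine-tuning path (the repo id is taken from this model page; everything else is an assumption, not the original training script): the published adapter can be reloaded in trainable mode and trained further on in-domain data.

from transformers import AutoModelForTokenClassification
from peft import PeftModel

base_model = AutoModelForTokenClassification.from_pretrained(
    "aubmindlab/bert-base-arabertv2", num_labels=9
)

# is_trainable=True keeps the adapter weights unfrozen so training can resume
model = PeftModel.from_pretrained(
    base_model, "Diaa-Essam/arabert-v2-ner-lora", is_trainable=True
)

# `model` can now be passed to a transformers Trainer together with a
# domain-specific token-classification dataset (see Training Details below)

For custom entity types beyond the four categories above, a new classification head (different num_labels and id2label) and a fresh adapter would be configured instead, along the lines of the Training Hyperparameters section.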

Out-of-Scope Use

  • Non-Arabic text (the model is trained exclusively on Arabic)
  • Sentiment analysis or other non-NER tasks
  • Real-time applications requiring sub-millisecond latency without optimization

Bias, Risks, and Limitations

  • The model's performance may vary across different Arabic dialects and writing styles
  • Entity recognition accuracy depends on text quality and may be lower for informal or dialectal Arabic
  • The model may underperform on domain-specific jargon not present in the training data
  • MISC entities have lower representation in the training data, though class weighting was applied to mitigate this

Recommendations

Users should be aware of the model's limitations regarding:

  • Dialectal variations in Arabic text
  • Domain-specific terminology
  • The need for post-processing to handle multi-word entities correctly (a minimal grouping sketch follows this list)
  • Potential biases in the training data reflecting the source corpus
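
The sketch below illustrates the multi-word entity point: it merges word-level IOB2 predictions, in the (word, tag) format produced by the example under "How to Get Started with the Model", into entity spans. The function name and input format are illustrative assumptions, not part of this repository.

def group_entities(tagged_words):
    """Merge word-level IOB2 tags, e.g. [("...", "B-PER"), ("...", "O"), ...], into (text, type) spans."""
    entities, current_words, current_type = [], [], None
    for word, tag in tagged_words:
        if tag.startswith("B-"):
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_words.append(word)
        else:
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities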

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForTokenClassification
from peft import PeftModel
import torch

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv2")
base_model = AutoModelForTokenClassification.from_pretrained(
    "aubmindlab/bert-base-arabertv2",
    num_labels=9
)

# Load the LoRA adapter (this repository: Diaa-Essam/arabert-v2-ner-lora)
model = PeftModel.from_pretrained(base_model, "Diaa-Essam/arabert-v2-ner-lora")
model = model.merge_and_unload()

# Prepare input
text = "محمد يعمل في شركة جوجل في القاهرة"  # "Mohamed works at Google in Cairo"
tokens = text.split()

inputs = tokenizer(
    tokens,
    is_split_into_words=True,
    return_tensors="pt",
    truncation=True
)

# Predict
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.argmax(outputs.logits, dim=-1)

# Map predictions to labels
id_to_tag = {0: 'B-LOC', 1: 'B-MISC', 2: 'B-ORG', 3: 'B-PER', 
             4: 'I-LOC', 5: 'I-MISC', 6: 'I-ORG', 7: 'I-PER', 8: 'O'}

word_ids = inputs.word_ids(batch_index=0)
results = []
previous_word_idx = None

for idx, word_idx in enumerate(word_ids):
    if word_idx is not None and word_idx != previous_word_idx:
        pred_label = id_to_tag[predictions[0][idx].item()]
        results.append((tokens[word_idx], pred_label))
    previous_word_idx = word_idx

print(results)

Training Details

Training Data

The model was trained on the iSemantics/conllpp-ner-ar dataset, which is an Arabic adaptation of the CoNLL++ NER dataset. The dataset contains Arabic text annotated with four entity types (PER, ORG, LOC, MISC) using the IOB2 tagging scheme.

Training data composition:

  • Training samples: combined train + validation splits for the final training run (see the loading sketch after this list)
  • Test samples: Held-out test set for evaluation
  • Entity types: 4 (PER, ORG, LOC, MISC) with 9 labels including IOB tags and 'O'
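
A minimal loading sketch with the 🤗 datasets library is shown below; the split names and the "ner_tags" column name follow the usual CoNLL-style layout and are assumptions rather than a copy of the original training script.

from datasets import load_dataset, concatenate_datasets

# Load the Arabic CoNLL++ NER dataset
dataset = load_dataset("iSemantics/conllpp-ner-ar")

# Combine train + validation for the final training run;
# the test split stays held out for evaluation
train_data = concatenate_datasets([dataset["train"], dataset["validation"]])
test_data = dataset["test"]

# IOB2 tag names over PER/ORG/LOC/MISC plus 'O' (assumes a ClassLabel "ner_tags" column)
label_names = train_data.features["ner_tags"].feature.names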

Training Procedure

Preprocessing

  • Text tokenization using AraBERT v2 tokenizer
  • Token alignment for subword tokenization
  • Maximum sequence length: 128 tokens
  • Special tokens and subword continuations labeled with -100 (ignored in loss calculation)
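
A sketch of the alignment step described above, following the standard Hugging Face token-classification recipe (the "tokens" and "ner_tags" column names are assumptions):

def tokenize_and_align_labels(examples, tokenizer, max_length=128):
    """Tokenize pre-split words and align IOB2 labels to subword tokens.

    Special tokens and subword continuations get -100 so they are
    ignored by the cross-entropy loss.
    """
    tokenized = tokenizer(
        examples["tokens"],
        is_split_into_words=True,
        truncation=True,
        max_length=max_length,
    )
    all_labels = []
    for i, labels in enumerate(examples["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        previous_word_idx = None
        label_ids = []
        for word_idx in word_ids:
            if word_idx is None:                 # [CLS], [SEP], padding
                label_ids.append(-100)
            elif word_idx != previous_word_idx:  # first subword of a word
                label_ids.append(labels[word_idx])
            else:                                # subword continuation
                label_ids.append(-100)
            previous_word_idx = word_idx
        all_labels.append(label_ids)
    tokenized["labels"] = all_labels
    return tokenized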

Training Hyperparameters

LoRA Configuration:

  • LoRA rank (r): 32
  • LoRA alpha: 64
  • LoRA dropout: 0.05
  • Target modules: query, value, key, dense layers
  • Trainable parameters: 3.7987% of total model parameters
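
A LoraConfig sketch matching the values above; settings not listed in this card (e.g. bias handling) are left at PEFT defaults and are assumptions. task_type=TOKEN_CLS also marks the classification head as trainable alongside the adapter.

from transformers import AutoModelForTokenClassification
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForTokenClassification.from_pretrained(
    "aubmindlab/bert-base-arabertv2", num_labels=9
)

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,      # keeps the classifier head trainable
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["query", "key", "value", "dense"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()     # should report roughly the ~3.8% figure quoted above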

Training Arguments:

  • Training regime: fp16 mixed precision (when GPU available)
  • Learning rate: 1e-4
  • Batch size: 32 (training), 64 (evaluation)
  • Number of epochs: 70
  • Weight decay: 0.01
  • Warmup ratio: 0.15
  • Optimizer: AdamW
  • LR scheduler: Cosine
  • Label smoothing: 0.1
  • Gradient accumulation steps: 1
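
A TrainingArguments sketch matching the values above; output_dir and anything not listed (evaluation/save strategy, logging) are assumptions. AdamW is the Trainer default, so no explicit optimizer argument is needed.

import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="arabert-ner-lora",        # output path is an assumption
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    num_train_epochs=70,
    weight_decay=0.01,
    warmup_ratio=0.15,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    gradient_accumulation_steps=1,
    fp16=torch.cuda.is_available(),       # fp16 mixed precision when a GPU is available
)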

Class Weighting: Custom weighted loss function applied with the following weights:

  • B-LOC, I-LOC, B-ORG, I-ORG, B-PER, I-PER: 1.0
  • B-MISC, I-MISC: 2.5 (increased to handle class imbalance)
  • O (outside): 0.5 (decreased to focus on entities)
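
One way to implement such a weighted loss (a sketch, not the exact training code) is to override compute_loss in a Trainer subclass, using the label order from the id_to_tag mapping in the usage example above:

import torch
import torch.nn as nn
from transformers import Trainer

# Weights in the label order of id_to_tag above:
# B-LOC, B-MISC, B-ORG, B-PER, I-LOC, I-MISC, I-ORG, I-PER, O
class_weights = torch.tensor([1.0, 2.5, 1.0, 1.0, 1.0, 2.5, 1.0, 1.0, 0.5])

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = nn.CrossEntropyLoss(
            weight=class_weights.to(logits.device),
            ignore_index=-100,        # skip special tokens / subword continuations
            label_smoothing=0.1,      # matches the label-smoothing setting above
        )
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss

An instance of WeightedLossTrainer is then used in place of the stock Trainer.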

Speeds, Sizes, Times

  • Training time: roughly 20-30 minutes (with GPU acceleration)
  • Model size: Base model + LoRA adapter (~550MB total)
  • Throughput: Varies by hardware; optimized for GPU inference

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on the test split of the iSemantics/conllpp-ner-ar dataset.

Metrics

Evaluation metrics computed using the seqeval library:

  • Precision: Proportion of predicted entities that are correct
  • Recall: Proportion of true entities that are identified
  • F1 Score: Harmonic mean of precision and recall
  • Accuracy: Token-level accuracy
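
A compute_metrics sketch using seqeval directly (the original run may have used the evaluate wrapper instead); the -100 filtering mirrors the preprocessing described above:

import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

id_to_tag = {0: "B-LOC", 1: "B-MISC", 2: "B-ORG", 3: "B-PER",
             4: "I-LOC", 5: "I-MISC", 6: "I-ORG", 7: "I-PER", 8: "O"}

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    true_tags, pred_tags = [], []
    for pred_row, label_row in zip(predictions, labels):
        # Drop -100 positions (special tokens and subword continuations)
        true_tags.append([id_to_tag[int(l)] for l in label_row if l != -100])
        pred_tags.append([id_to_tag[int(p)] for p, l in zip(pred_row, label_row) if l != -100])

    return {
        "precision": precision_score(true_tags, pred_tags),
        "recall": recall_score(true_tags, pred_tags),
        "f1": f1_score(true_tags, pred_tags),
        "accuracy": accuracy_score(true_tags, pred_tags),
    }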

Results

Overall Performance:

  • F1 Score: 0.8506
  • Precision: 0.8398
  • Recall: 0.8617
  • Accuracy: 0.9460

Per-Entity Performance: Results are strong across all four entity types (PER, ORG, LOC, MISC). MISC, the least frequent class, was improved through the class weighting described above.

See the detailed classification report in the training logs for complete per-class metrics.

Technical Specifications

Model Architecture and Objective

  • Base Architecture: BERT (Bidirectional Encoder Representations from Transformers)
  • Specific Model: AraBERT v2 base
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Task: Token Classification (Named Entity Recognition)
  • Output: 9-class classification (B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC, O)

Compute Infrastructure

Hardware

  • GPU-accelerated training (CUDA-enabled)
  • Optimized for modern GPUs (tested on various CUDA-compatible devices)

Software

  • Framework: PyTorch with Transformers and PEFT libraries
  • Key Libraries:
    • transformers (Hugging Face)
    • peft (Parameter-Efficient Fine-Tuning)
    • seqeval (evaluation metrics)
    • datasets (Hugging Face)

Framework Versions

  • PEFT: 0.16.0
  • Transformers: 4.x
  • PyTorch: 2.x
  • Python: 3.8+

Citation

If you use this model, please cite:

BibTeX:

@misc{arabic-ner-lora,
  author = {Diaa Eldin Essam Zaki},
  title = {Arabic Named Entity Recognition with LoRA Fine-tuning},
  year = {2025},
  publisher = {HuggingFace},
}

Glossary

  • NER: Named Entity Recognition - the task of identifying and classifying named entities in text
  • LoRA: Low-Rank Adaptation - a parameter-efficient fine-tuning method that adds trainable rank decomposition matrices
  • IOB2: Inside-Outside-Beginning tagging scheme for sequence labeling
  • AraBERT: Arabic BERT model pre-trained on large Arabic corpora
  • Token Classification: Assigning a label to each token in a sequence

Model Card Authors

Diaa Essam

Model Card Contact

diaaesam123@gmail.com
