---
language: ms
license: apache-2.0
tags:
- sentiment-analysis
- malay
- bert
- text-classification
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: ft-Malay-bert
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      type: custom
      name: Malay Sentiment Dataset
    metrics:
    - type: accuracy
      value: 0.85
      name: Accuracy
---
# Malay BERT for Sentiment Analysis

A fine-tuned BERT model for Malay sentiment analysis with three-class classification (negative, neutral, positive).
## Label Mapping

**Important:** This model uses the following label mapping:

```python
id2label = {
    0: "negative",
    1: "neutral",
    2: "positive",
}

label2id = {
    "negative": 0,
    "neutral": 1,
    "positive": 2,
}
```
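If you load the checkpoint directly rather than through a pipeline, you can pass this mapping to `from_pretrained` so predictions come back with readable names. The snippet below is a minimal sketch; it assumes the published config does not already store these names (the pipeline output further down reports the generic `LABEL_N` ids).

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

id2label = {0: "negative", 1: "neutral", 2: "positive"}
label2id = {label: idx for idx, label in id2label.items()}

# Passing the mapping as config overrides makes model.config (and any
# pipeline built on this model) report "negative"/"neutral"/"positive"
# instead of LABEL_0 / LABEL_1 / LABEL_2.
tokenizer = AutoTokenizer.from_pretrained("rmtariq/ft-Malay-bert")
model = AutoModelForSequenceClassification.from_pretrained(
    "rmtariq/ft-Malay-bert",
    id2label=id2label,
    label2id=label2id,
)
```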
## Quick Usage

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="rmtariq/ft-Malay-bert")
result = classifier("Saya sangat gembira!")
print(result)
# [{'label': 'LABEL_2', 'score': 0.995}]
# LABEL_2 = positive
```
## Label Interpretation

- `LABEL_0` or `0` → negative sentiment
- `LABEL_1` or `1` → neutral sentiment
- `LABEL_2` or `2` → positive sentiment
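If you keep the default checkpoint config, a small lookup table can translate the generic pipeline labels back to these sentiment names. A minimal sketch is shown below; the `LABEL_NAMES` dict and the second example sentence are only illustrative.

```python
from transformers import pipeline

# Translate the generic pipeline labels back to sentiment names.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}

classifier = pipeline("sentiment-analysis", model="rmtariq/ft-Malay-bert")
texts = [
    "Saya sangat gembira!",       # "I am very happy!"
    "Filem itu sangat teruk.",    # "That film was terrible."
]
for pred in classifier(texts):
    print(LABEL_NAMES[pred["label"]], round(pred["score"], 3))
```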
## Model Details

- Language: Malay (Bahasa Malaysia)
- Task: Sentiment Analysis
- Classes: 3 (negative, neutral, positive)
- Base Model: BERT
## Training

This model was fine-tuned on a custom Malay sentiment dataset for three-class classification.
## Limitations

- Optimized for Malaysian Malay text
- May have reduced performance on other Malay dialects
- Performance on mixed-language (code-switched) text may vary
## Citation

```bibtex
@misc{ft-malay-bert,
  author = {rmtariq},
  title = {Fine-tuned Malay BERT for Sentiment Analysis},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/rmtariq/ft-Malay-bert}
}
```