license: mit
language: ko
tags:
  - hate-speech
  - classification
  - korean
  - electra
datasets:
  - jeanlee/kmhas_korean_hate_speech
model_name: kcELECTRA-based Korean Hate Speech Classifier

kcELECTRA-based Korean Hate Speech Classifier

이 λͺ¨λΈμ€ beomi/kcELECTRA-base-v2022λ₯Ό 기반으둜, jeanlee/kmhas_korean_hate_speech 데이터셋을 μ‚¬μš©ν•΄ ν•œκ΅­μ–΄ 혐였 ν‘œν˜„ λΆ„λ₯˜ νƒœμŠ€ν¬μ— 맞좰 νŒŒμΈνŠœλ‹ν•œ λͺ¨λΈμž…λ‹ˆλ‹€.

🧠 Model Architecture

  • ✅ Base Model: kcELECTRA-base-v2022 (ELECTRA pretrained on a Korean corpus)
  • ✅ Head: Sequence Classification Head (Binary: hate / non-hate)
  • ✅ Output: label=1 (hate), label=0 (non-hate)

🗂 Dataset Information

  • Source: jeanlee/kmhas_korean_hate_speech
  • Format: text + 8 hate speech category labels
  • Preprocessing:
    • Label 8 (not_hate_speech) is mapped to 0 and every other label to 1, turning the task into binary classification

🏋️‍♂️ Fine-tuning Information

| Item | Value |
|---|---|
| Train Epochs | 3 |
| Batch Size | 16 |
| Optimizer | AdamW |
| Learning Rate | 5e-5 |
| Evaluation Metric | Accuracy (others can be added) |
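
The card names accuracy as the evaluation metric but does not show how it was computed. A minimal sketch, assuming the classifier emits two logits per example (index 0 = non-hate, index 1 = hate) and that the prediction is the index of the larger logit; `accuracy` is an illustrative helper, not the author's evaluation code.

```python
from typing import Sequence


def accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of examples whose highest-scoring class matches the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        return 0.0
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


# Toy logits for three examples; the third one is misclassified.
print(accuracy([[2.0, -1.0], [0.1, 0.9], [0.3, 0.2]], [0, 1, 1]))
```

Because non-hate examples may outnumber hate examples after binarization, precision, recall, or F1 could be reported alongside accuracy in the same way.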

🚀 Usage Example (Inference)

from transformers import pipeline

model = pipeline("text-classification", model="jinkyeongk/kcELECTRA-toxic-detector")

text = "너 진짜 못생겼다"  # "You are really ugly"
result = model(text)

print(result)
# [{'label': 'LABEL_1', 'score': 0.987}]  ← hate
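
The pipeline returns generic label names such as `LABEL_1`. A small sketch of turning them into readable names, assuming the mapping given under Model Architecture (label 1 = hate, label 0 = non-hate); `LABEL_NAMES` and `readable` are hypothetical helpers, not part of the model or of `transformers`.

```python
# Mapping assumed from the card: label=1 is hate, label=0 is non-hate.
LABEL_NAMES = {"LABEL_0": "non-hate", "LABEL_1": "hate"}


def readable(predictions):
    """Convert pipeline-style dicts into (name, score) pairs."""
    return [
        (LABEL_NAMES.get(p["label"], p["label"]), p["score"])
        for p in predictions
    ]


# Same shape as the pipeline output shown above.
print(readable([{"label": "LABEL_1", "score": 0.987}]))
# [('hate', 0.987)]
```

Unknown label names are passed through unchanged, so the helper stays safe if the model config later defines its own `id2label` names.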