5CD-AI/visocial-T5-base

khang119966 committed 9 days ago

Commit 5a21fb7 · verified · 1 parent: 59dbe41

Update README.md

Files changed (1)

README.md +3 -8

@@ -1,17 +1,12 @@
 ---
-library_name: transformers
-tags: []
-pipeline_tag: text2text-generation
-widget:
-- text: Dành cho <extra_id_0> hàng th <extra_id_1>iết khi mua xe tay ga và Super Cub (khách hàng mua xe <extra_id_2>1/2017).</s> 🍓 Mua góp lã <extra_id_3>ất  <extra_id_4> dẫn c <extra_id_5> từ  <extra_id_6></s> 🍓 Mua góp nhận <extra_id_7> vẹt gốc <extra_id_8></s>
-  example_title: Example 1
+pipeline_tag: fill-mask
 ---
 # 5CD-AI/visocial-T5-base
 ## Overview
 <!-- Provide a quick summary of what the model is/does. -->
 We trimmed vocabulary size to 50,589 and continually pretrained `google/mt5-base`[1] on a merged 20GB dataset, the training dataset includes:
-- Crawled data (100M comments and 15M posts on Facebook)
+- Crawled data (Million of comments and posts on Facebook)
 - UIT data[2], which is used to pretrain `uitnlp/visobert`[2]
 - MC4 ecommerce
 - 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`[7]
@@ -201,4 +196,4 @@ We fine-tune `5CD-AI/visocial-T5-base` on 3 downstream tasks with `transformers`
 [7] [ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model](https://arxiv.org/abs/2405.14141)
-[8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
+[8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
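
The widget text removed in this commit uses T5-style sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) to mark the spans the model is expected to fill in. As a rough illustration of that input format only, the sketch below builds a masked input and its matching target from character spans. The helper name `mask_spans` and the character-level spans are assumptions made for this example; they are not taken from the repository, and the model's actual preprocessing works on tokens rather than characters.

```python
def mask_spans(text, spans):
    """Replace each (start, end) character span with a T5-style sentinel.

    Hypothetical helper for illustration. Returns (masked_input, target),
    where the target lists each sentinel followed by the text it replaced
    and ends with one extra closing sentinel, as in T5 span corruption.
    """
    pieces, target = [], []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        # Spans must be non-empty, in range, and must not overlap.
        if start < cursor or end > len(text) or start >= end:
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = f"<extra_id_{i}>"
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(sentinel)            # sentinel stands in for the span
        target.append(sentinel + text[start:end])
        cursor = end
    pieces.append(text[cursor:])           # untouched tail after the last span
    target.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return "".join(pieces), "".join(target)


masked, target = mask_spans("The quick brown fox", [(4, 9), (16, 19)])
print(masked)
print(target)
```

A string shaped like `masked` is what the removed widget example passed to the model; `target` shows the shape of output a span-filling T5 model is trained to produce.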