Commit 5a21fb7 (verified), committed by khang119966
1 parent: 59dbe41

Update README.md

Files changed (1)
  1. README.md +3 -8
README.md CHANGED
@@ -1,17 +1,12 @@
  ---
- library_name: transformers
- tags: []
- pipeline_tag: text2text-generation
- widget:
- - text: Dành cho <extra_id_0> hàng th <extra_id_1>iết khi mua xe tay ga và Super Cub (khách hàng mua xe <extra_id_2>1/2017).</s> 🍓 Mua góp lã <extra_id_3>ất <extra_id_4> dẫn c <extra_id_5> từ <extra_id_6></s> 🍓 Mua góp nhận <extra_id_7> vẹt gốc <extra_id_8></s>
- example_title: Example 1
+ pipeline_tag: fill-mask
  ---
 
  # 5CD-AI/visocial-T5-base
  ## Overview
  <!-- Provide a quick summary of what the model is/does. -->
  We trimmed vocabulary size to 50,589 and continually pretrained `google/mt5-base`[1] on a merged 20GB dataset, the training dataset includes:
- - Crawled data (100M comments and 15M posts on Facebook)
+ - Crawled data (Million of comments and posts on Facebook)
  - UIT data[2], which is used to pretrain `uitnlp/visobert`[2]
  - MC4 ecommerce
  - 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`[7]
@@ -201,4 +196,4 @@ We fine-tune `5CD-AI/visocial-T5-base` on 3 downstream tasks with `transformers`
 
  [7] [ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model](https://arxiv.org/abs/2405.14141)
 
- [8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
+ [8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
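
The widget text removed in this commit is written with `<extra_id_N>` sentinel tokens, the placeholder format that T5-style models use for masked spans. The sketch below shows how such an input and its matching target could be built from plain text. It assumes the standard T5 span-corruption layout and masks whole whitespace-separated words for readability; the widget example appears to mask at subword level, and the model card's actual preprocessing is not shown in this diff. The helper name `mask_spans` and the sample sentence are hypothetical and not part of the repository.

```python
def mask_spans(tokens, spans):
    """Replace each (start, end) token span with a T5-style sentinel.

    Hypothetical helper for illustration only. `spans` holds
    non-overlapping, half-open index ranges into `tokens`.
    Returns the masked input string and the target string.
    """
    input_parts, target_parts = [], []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{sentinel_id}>"
        # Copy the untouched tokens, then stand in one sentinel for the span.
        input_parts.extend(tokens[cursor:start])
        input_parts.append(sentinel)
        # The target repeats the sentinel followed by the hidden tokens.
        target_parts.append(sentinel)
        target_parts.extend(tokens[start:end])
        cursor = end
    input_parts.extend(tokens[cursor:])
    # A final sentinel closes the last span in the target.
    target_parts.append(f"<extra_id_{len(spans)}>")
    return " ".join(input_parts), " ".join(target_parts)


# Assumed sample sentence; any text would work the same way.
tokens = "Thank you for inviting me to your party last week".split()
masked_input, target = mask_spans(tokens, [(2, 4), (8, 9)])
print(masked_input)  # Thank you <extra_id_0> me to your party <extra_id_1> week
print(target)        # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

A model trained this way is expected to generate the target string, so each predicted span is read off between consecutive sentinels in the output.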