Update README.md
README.md
CHANGED
|
@@ -1,17 +1,12 @@
|
|
| 1 |
---
|
| 2 |
-
|
| 3 |
-
tags: []
|
| 4 |
-
pipeline_tag: text2text-generation
|
| 5 |
-
widget:
|
| 6 |
-
- text: Dành cho <extra_id_0> hàng th <extra_id_1>iết khi mua xe tay ga và Super Cub (khách hàng mua xe <extra_id_2>1/2017).</s> 🍓 Mua góp lã <extra_id_3>ất <extra_id_4> dẫn c <extra_id_5> từ <extra_id_6></s> 🍓 Mua góp nhận <extra_id_7> vẹt gốc <extra_id_8></s>
|
| 7 |
-
example_title: Example 1
|
| 8 |
---
|
| 9 |
|
| 10 |
# 5CD-AI/visocial-T5-base
|
| 11 |
## Overview
|
| 12 |
<!-- Provide a quick summary of what the model is/does. -->
|
| 13 |
We trimmed the vocabulary size to 50,589 and continually pretrained `google/mt5-base`[1] on a merged 20GB dataset. The training dataset includes:
|
| 14 |
-
- Crawled data (
|
| 15 |
- UIT data[2], which is used to pretrain `uitnlp/visobert`[2]
|
| 16 |
- MC4 ecommerce
|
| 17 |
- 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`[7]
|
|
@@ -201,4 +196,4 @@ We fine-tune `5CD-AI/visocial-T5-base` on 3 downstream tasks with `transformers`
|
|
| 201 |
|
| 202 |
[7] [ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model](https://arxiv.org/abs/2405.14141)
|
| 203 |
|
| 204 |
-
[8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
|
|
|
|
| 1 |
---
|
| 2 |
+
pipeline_tag: fill-mask
|
| 3 |
---
|
| 4 |
|
| 5 |
# 5CD-AI/visocial-T5-base
|
| 6 |
## Overview
|
| 7 |
<!-- Provide a quick summary of what the model is/does. -->
|
| 8 |
We trimmed the vocabulary size to 50,589 and continually pretrained `google/mt5-base`[1] on a merged 20GB dataset. The training dataset includes:
|
| 9 |
+
- Crawled data (Millions of comments and posts on Facebook)
|
| 10 |
- UIT data[2], which is used to pretrain `uitnlp/visobert`[2]
|
| 11 |
- MC4 ecommerce
|
| 12 |
- 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`[7]
|
|
|
|
| 196 |
|
| 197 |
[7] [ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model](https://arxiv.org/abs/2405.14141)
|
| 198 |
|
| 199 |
+
[8] [ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation](https://aclanthology.org/2022.naacl-srw.18/)
|