Commit History

fix: recover empty phoneme_id_map for ubl, tzo-chenalhó, ubu from MMS vocab.txt
3aab73c
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
a4648aa
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
a514879
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
538dea8
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
218de03
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
9d77f91
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
e76bba3
verified

Jarbas committed on

Fix pad/blank token (capture tokenizer_config pad_token)
bb9fb82
verified

Jarbas committed on

Set alphabet=latin (uroman output_alphabet == input_alphabet)
922fba4
verified

Jarbas committed on

Set alphabet=latin (uroman output_alphabet == input_alphabet)
53a7ee7
verified

Jarbas committed on

Set alphabet=latin (uroman output_alphabet == input_alphabet)
d5640b1
verified

Jarbas committed on

Set alphabet=latin (uroman output_alphabet == input_alphabet)
f24c95e
verified

Jarbas committed on

Set alphabet=latin (uroman output_alphabet == input_alphabet)
3bfc7d2
verified

Jarbas committed on

Set phoneme_type=uroman (MMS is_uroman model)
e3ced1b
verified

Jarbas committed on

Set phoneme_type=uroman (MMS is_uroman model)
baf7c8a
verified

Jarbas committed on

Set phoneme_type=uroman (MMS is_uroman model)
3d46ab0
verified

Jarbas committed on

Set phoneme_type=uroman (MMS is_uroman model)
1d6049a
verified

Jarbas committed on

Set phoneme_type=uroman (MMS is_uroman model)
710163e
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
ae3d1ea
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
cdbf516
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
0475699
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
076dc94
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
947c60f
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
b991988
verified

Jarbas committed on

Add self-contained config.json (inline phoneme_id_map + tokenizer flags)
88e55b0
verified

Jarbas committed on

migrate xtm from willwade
b3b6481
verified

Jarbas committed on

migrate khm from willwade
11fada4
verified

Jarbas committed on

lang_code -> es-CL (BCP-47)
80e39a4
verified

Jarbas committed on

lang_code -> es-AR (BCP-47)
c1d09ee
verified

Jarbas committed on

lang_code -> es-CO (BCP-47)
e73b4fe
verified

Jarbas committed on

lang_code -> ta (BCP-47)
ea248b1
verified

Jarbas committed on

lang_code -> pt-BR (BCP-47)
9e10df7
verified

Jarbas committed on

lang_code -> gu-IN (BCP-47)
4337ac9
verified

Jarbas committed on

lang_code -> mr-IN (BCP-47)
7951318
verified

Jarbas committed on

lang_code -> wal-Ethi (BCP-47)
c218580
verified

Jarbas committed on

lang_code -> wal-Latn (BCP-47)
e7c94f3
verified

Jarbas committed on

lang_code -> cy-GB (BCP-47)
fc51b12
verified

Jarbas committed on

lang_code -> vi-VN (BCP-47)
6cfb820
verified

Jarbas committed on

lang_code -> uz-Cyrl (BCP-47)
d28d6df
verified

Jarbas committed on

lang_code -> ug-Cyrl (BCP-47)
17ad3d5
verified

Jarbas committed on

lang_code -> ug-Arab (BCP-47)
a1737f2
verified

Jarbas committed on

lang_code -> ur-Latn (BCP-47)
9596e94
verified

Jarbas committed on

lang_code -> ur (BCP-47)
2618db2
verified

Jarbas committed on

lang_code -> ur-Deva (BCP-47)
61bd99b
verified

Jarbas committed on

lang_code -> uk-UA (BCP-47)
8e707d4
verified

Jarbas committed on

lang_code -> tzj-x-eastern (BCP-47)
6a8f4f7
verified

Jarbas committed on

lang_code -> tzj-x-western (BCP-47)
cbf32e8
verified

Jarbas committed on

lang_code -> tzo-x-chamula (BCP-47)
0fd2a7c
verified

Jarbas committed on

lang_code -> tzo-x-chenalho (BCP-47)
b98abe3
verified

Jarbas committed on

lang_code -> tzh-x-tenejapa (BCP-47)
814c751
verified

Jarbas committed on