Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Paper • 2105.02855 • Published
How to use GroNLP/bert-base-dutch-cased-gronings with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="GroNLP/bert-base-dutch-cased-gronings")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("GroNLP/bert-base-dutch-cased-gronings")
model = AutoModelForMaskedLM.from_pretrained("GroNLP/bert-base-dutch-cased-gronings")

Wietse de Vries • Martijn Bartelds • Malvina Nissim • Martijn Wieling
This model is part of this paper + code:
The best fine-tuned models for Gronings and West Frisian are available on the HuggingFace model hub:
These models are identical to BERTje, but with different lexical layers (bert.embeddings.word_embeddings).
- GroNLP/bert-base-dutch-cased (Dutch; source language)
- GroNLP/bert-base-dutch-cased-gronings (Gronings)
- GroNLP/bert-base-dutch-cased-frisian (West Frisian)

These models share the same fine-tuned Transformer layers + classification head, but with the retrained lexical layers from the models above.
- GroNLP/bert-base-dutch-cased-upos-alpino (Dutch)
- GroNLP/bert-base-dutch-cased-upos-alpino-gronings (Gronings)
- GroNLP/bert-base-dutch-cased-upos-alpino-frisian (West Frisian)
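The relationship between the two groups of models above can be pictured as an operation on state dicts: keep every fine-tuned weight of the Dutch tagger and replace only the word-embedding matrix with the one retrained for the target language. The following is a minimal sketch of that idea, not necessarily how the authors built the published checkpoints. The helper names `swap_lexical_layer` and `differing_keys` are hypothetical, the toy dictionaries stand in for real state dicts, and the key `bert.embeddings.word_embeddings.weight` is assumed from the parameter naming used by BERT models in Transformers.

```python
import operator
from typing import Any, Callable, Dict, List, Mapping

# Assumed state-dict key of the lexical layer (bert.embeddings.word_embeddings).
LEXICAL_KEY = "bert.embeddings.word_embeddings.weight"


def swap_lexical_layer(
    task_state: Mapping[str, Any],
    lexical_state: Mapping[str, Any],
    key: str = LEXICAL_KEY,
) -> Dict[str, Any]:
    """Return a copy of task_state whose lexical layer comes from lexical_state.

    Hypothetical helper: all other entries (Transformer layers, classification
    head) are kept from task_state, and the inputs are not modified.
    """
    if key not in task_state or key not in lexical_state:
        raise KeyError(f"{key!r} must be present in both state dicts")
    merged = dict(task_state)
    merged[key] = lexical_state[key]
    return merged


def differing_keys(
    a: Mapping[str, Any],
    b: Mapping[str, Any],
    equal: Callable[[Any, Any], bool] = operator.eq,
) -> List[str]:
    """List the shared keys whose values differ between two state dicts.

    For real PyTorch tensors, pass equal=torch.equal instead of the default.
    """
    return sorted(k for k in a.keys() & b.keys() if not equal(a[k], b[k]))


# Toy stand-ins for state dicts (nested lists instead of tensors).
dutch_tagger = {
    LEXICAL_KEY: [[0.1, 0.2], [0.3, 0.4]],
    "bert.encoder.layer.0.output.dense.weight": [[1.0]],
    "classifier.weight": [[0.5]],
}
gronings_mlm = {
    LEXICAL_KEY: [[0.9, 0.8], [0.7, 0.6]],
    "bert.encoder.layer.0.output.dense.weight": [[2.0]],
    "cls.predictions.bias": [0.0, 0.0],
}

gronings_tagger = swap_lexical_layer(dutch_tagger, gronings_mlm)
print(differing_keys(dutch_tagger, gronings_tagger))
```

With real checkpoints, the same comparison could be run on the `state_dict()` of two loaded models to check which parameters actually differ. If the vocabularies differ in size, the model configuration would also need to match the new embedding matrix, which this sketch does not handle.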