LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
Paper: [arXiv:2402.11485](https://arxiv.org/abs/2402.11485)
LEIA is a training technique for autoregressive LLMs that improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language. It augments target-language Wikipedia text by inserting the English names of entities (obtained from Wikipedia inter-language links) next to their mentions, and fine-tunes the model on the augmented corpus; a rough sketch of this idea is shown below. This model is constructed by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2. It achieves improved scores on six Japanese question-answering benchmarks, as reported below.
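The following is a minimal sketch of this entity-based augmentation, assuming a toy mapping from Japanese entity names to English Wikipedia titles; the helper function, the mapping, and the `<translate>` token names are illustrative stand-ins, not the exact implementation from the paper.

```python
# Toy stand-in for Wikipedia inter-language links (hypothetical data).
ja_to_en = {
    "富士山": "Mount Fuji",
    "東京": "Tokyo",
}

def augment(text: str, mentions: list[tuple[int, int]]) -> str:
    """Insert the English name of each linked entity mention into the text,
    wrapped in special tokens, so both languages appear side by side."""
    out, prev = [], 0
    for start, end in mentions:
        mention = text[start:end]
        out.append(text[prev:end])
        en_name = ja_to_en.get(mention)
        if en_name is not None:
            # Special token names are illustrative, not the paper's exact ones.
            out.append(f"<translate>{en_name}</translate>")
        prev = end
    out.append(text[prev:])
    return "".join(out)

print(augment("富士山は日本一高い山です。", [(0, 3)]))
# -> 富士山<translate>Mount Fuji</translate>は日本一高い山です。
# ("Mount Fuji is the highest mountain in Japan.")
```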
Please refer to our paper or blog post (in Japanese) for further technical details.
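Below is a minimal usage sketch with Hugging Face Transformers; the repository id is a placeholder, so substitute the actual model id of this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "leia-llm/Leia-Swallow-7b"  # placeholder; use this repo's model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "日本で一番高い山は"  # "The highest mountain in Japan is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```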
The model is evaluated on the following six question-answering benchmarks:
| Model | X-CODAH | X-CSQA | JCommonsenseQA | NIILC | JEMHopQA | JAQKET v2 |
|---|---|---|---|---|---|---|
| Swallow | 42.0 | 41.0 | 80.3 | 59.5 | 50.8 | 86.2 |
| LEIA | 42.7 | 42.4 | 80.6 | 60.3 | 54.7 | 86.5 |
For further details on these experiments, please refer to our paper.