Finetuning Base Falcon on Unseen Language/New data (non instruct/RLHF)

#91

by AshBam - opened Jul 14, 2023

edited Jul 14, 2023

I understand that the Falcon model is not meant to work on unseen languages (this is listed as a limitation). However, I need to do so. Instruct-only finetuning is giving pretty unstable results at the moment. I've scoured the internet for a resource that covers this but have not been able to find one.

Does anyone have any idea how this could be made possible? I've been going through the PEFT, LoRA, and DeepSpeed libraries as they relate to Falcon to get some idea of how to reverse engineer the process: to understand how to add new layers on top of the frozen layers, and whether it might be possible to unfreeze and tune other layers. However, I've not been able to find anything workable.

Please help me out if there are any resources for this.

cmp-nct

Jul 14, 2023

I suppose no one has tried, but that doesn't mean it won't work.
Personally I'd try careful fine-tuning of the embeddings using dictionaries of that particular language in combination with all the languages Falcon knows well, so it can find connections between the new words and existing words.
Then do the same with full sentences, keeping a large corpus of examples that are never trained on so you can regularly test the progress.
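
To make the data side of this suggestion concrete, here is a minimal sketch (standard library only) of mixing new-language text with text from languages the model already handles, while setting aside a held-out portion of the new-language data for periodic evaluation. Everything in it is an assumption for illustration: the function name `build_mixed_split`, the parameters `known_ratio` and `holdout_fraction`, and their default values are hypothetical and do not come from Falcon or any library. It only prepares the data; the actual embedding fine-tuning is not shown.

```python
import random


def build_mixed_split(new_lang_texts, known_lang_texts,
                      known_ratio=0.5, holdout_fraction=0.1, seed=0):
    """Return (train, holdout) lists of text examples.

    Hypothetical helper: `known_ratio` is the share of the training set
    drawn from languages the model already knows, and `holdout_fraction`
    is the share of new-language examples that are never trained on.
    """
    if not 0.0 <= known_ratio < 1.0:
        raise ValueError("known_ratio must be in [0, 1)")
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be in (0, 1)")

    rng = random.Random(seed)

    # Shuffle a copy so the caller's list is left untouched.
    new_lang = list(new_lang_texts)
    rng.shuffle(new_lang)

    # Set aside new-language examples that are used only for testing.
    n_holdout = max(1, int(len(new_lang) * holdout_fraction))
    holdout = new_lang[:n_holdout]
    train_new = new_lang[n_holdout:]

    # Pick enough known-language examples to reach the requested share,
    # capped by how many are actually available.
    n_known = int(len(train_new) * known_ratio / (1.0 - known_ratio))
    n_known = min(n_known, len(known_lang_texts))
    train_known = rng.sample(list(known_lang_texts), n_known)

    train = train_new + train_known
    rng.shuffle(train)
    return train, holdout


if __name__ == "__main__":
    new = [f"new-language sentence {i}" for i in range(100)]
    known = [f"known-language sentence {i}" for i in range(200)]
    train, holdout = build_mixed_split(new, known)
    print(len(train), "training examples,", len(holdout), "held-out examples")
```

The held-out list is kept disjoint from the training list, so running the model on it at regular intervals shows whether the new language is actually improving and not just being memorised. The right mixing ratio is something to tune by experiment.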

AshBam

Jul 19, 2023

Thanks, will try something out.
