An RLVF pipeline that uses parser oracles to align language models for Icelandic and Danish. GPT-SW3 and Viking-13B are trained with Delta-DPO.
Fakhar
Hodfa71
Recent Activity
updated a model about 10 hours ago
Hodfa71/llama-3.2-1b-is-saga-kl-sft-delta-dpo
published a model about 10 hours ago
Hodfa71/llama-3.2-1b-is-saga-kl-sft-delta-dpo
updated a model 1 day ago
Hodfa71/llama-3.1-8b-is-saga-kl-sft-delta-dpo