An RLVF pipeline that uses parser oracles to align language models for Icelandic and Danish. GPT-SW3 and Viking-13B are trained with Delta-DPO.
Fakhar
Hodfa71
Recent Activity
updated a dataset about 12 hours ago: Hodfa71/normistral-11b-nb-saga-kl-sft-delta-dpo-pairs
published a dataset about 12 hours ago: Hodfa71/normistral-11b-nb-saga-kl-sft-delta-dpo-pairs
updated a dataset about 12 hours ago: Hodfa71/normistral-11b-nb-saga-nosft-delta-dpo-pairs