Weighted-Reward Preference Optimization for Implicit Model Fusion Paper • 2412.03187 • Published Dec 4, 2024 • 12