Article • Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • Feb 11, 2025 • 106
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14, 2025 • 146
Running • 3.68k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Running • 592 • Scaling test-time compute • Run advanced LLM search strategies to boost problem solving
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer Paper • 2403.10301 • Published Mar 15, 2024 • 54