Introducing Rain-v2: Democratizing LLM training on gaming GPUs! ⚡
Following Rain-100M, we're scaling up. Rain-v2 features a larger training dataset.
We've published a comprehensive blog post covering the end-to-end journey, from raw data collection to rigorous evaluation and safety testing.
HF Repo: 🤗 raincandy-u/Rain-v2
Blog: https://angelkawaii.xyz/2026/01/29/rain-v2/
Special thanks to the open-source community and the SmolLM2 team for their foundational work!
HuggingFaceTB
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2502.02737)