RLHF Resources - a eZWALT Collection

updated Oct 21, 2025

HuggingFaceTB/SmolLM2-135M-Instruct
Text Generation • 0.1B • Updated Sep 22, 2025 • 1.3M • 327

Anthropic/hh-rlhf
Viewer • Updated May 26, 2023 • 169k • 39.9k • 1.74k

allenai/ultrafeedback_binarized_cleaned_train
Viewer • Updated Aug 28, 2024 • 61.8k • 81 • 1

arnir0/Tiny-LLM
Text Generation • 13M • Updated Nov 3, 2024 • 60k • 50

trl-lib/hh-rlhf-helpful-base
Viewer • Updated Jan 8, 2025 • 46.2k • 141 • 3

yitingxie/rlhf-reward-datasets
Viewer • Updated Jan 1, 2023 • 81.4k • 156 • 65