Catastrophic Forgetting in Mathematical Reasoning
MarioBarbeque/flan-t5-base-math-only-catastrophic 0.2B • Updated Dec 18, 2025 • 23 • 1
MarioBarbeque/flan-t5-base-nli-only-catastrophic 0.2B • Updated Dec 18, 2025 • 20 • 1
MarioBarbeque/flan-t5-base-mixed-1-1-catastrophic 0.2B • Updated Dec 18, 2025 • 11 • 1
MarioBarbeque/flan-t5-base-mixed-3-1-catastrophic 0.2B • Updated Dec 18, 2025 • 3 • 1
Finetuning
Models to fine-tune (and datasets to fine-tune with) in future projects
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Text Generation • 71B • Updated Apr 13, 2025 • 10.6k • 2.07k
FacebookAI/roberta-base Fill-Mask • 0.1B • Updated Feb 19, 2024 • 15.4M • 608
openai-community/gpt2 Text Generation • 0.1B • Updated Feb 19, 2024 • 16M • 3.27k
google/gemma-2-9b Text Generation • 9B • Updated Aug 7, 2024 • 79.1k • 708
Code Generation
Models and datasets relevant to training code generation models in future projects
code-search-net/code_search_net Viewer • Updated Feb 23 • 4.14M • 24.2k • 330
google/codegemma-7b Text Generation • 9B • Updated Aug 7, 2024 • 2.62k • 219
google/codegemma-7b-it Text Generation • 9B • Updated Aug 7, 2024 • 6.66k • 253
transformersbook/codeparrot Viewer • Updated Feb 5, 2022 • 18.7M • 382 • 62
Mathematics
Models and datasets related to mathematics generation
nvidia/OpenMathInstruct-2 Viewer • Updated Nov 25, 2024 • 22M • 64.5k • 244
microsoft/orca-math-word-problems-200k Viewer • Updated Mar 4, 2024 • 200k • 7.12k • 483
openai/gsm8k Benchmark • Updated Mar 23 • 17.6k • 957k • 1.35k
EleutherAI/llemma_34b Text Generation • Updated Apr 3, 2024 • 1.53k • 102