🔄 In a Training Loop

Asankhaya Sharma

codelion

hugging-science

http://asankhaya.github.io/

AI & ML interests

Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.

Recent Activity

updated a dataset about 15 hours ago

adaptive-classifier/ai-detector-data

reacted to their post with 🤗 1 day ago

SPROG-9M — a 9.37M parameter model trained from scratch to solve GSM8K-style math without using an LLM at inference. The model, https://huggingface.co/codelion/sprog-9m, predicts symbolic programs over number slots, then a deterministic executor does the arithmetic. With a simple verifier, it reaches ~11.8% on GSM8K test. We also released the dataset: https://huggingface.co/datasets/codelion/gsm8k-synth, 117K validated synthetic GSM8K-style problems. Tiny model, no pretraining, no LLM at inference, runs on a laptop.

reacted to the same post with 👀 1 day ago

Posts 48

SPROG-9M — a 9.37M parameter model trained from scratch to solve GSM8K-style math without using an LLM at inference.

The model, codelion/sprog-9m, predicts symbolic programs over number slots, then a deterministic executor does the arithmetic. With a simple verifier, it reaches ~11.8% on GSM8K test.

We also released the dataset: codelion/gsm8k-synth, 117K validated synthetic GSM8K-style problems.

Tiny model, no pretraining, no LLM at inference, runs on a laptop.
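
The split described in this post, where a model predicts a symbolic program over number slots and a deterministic executor does the arithmetic, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the slot names (n0, n1, ...), the four-operation vocabulary, the (op, left, right) step format, and the hand-written program standing in for a model prediction. It is not the actual SPROG-9M program format, and it leaves out the verifier.

```python
import re
from fractions import Fraction

# Hypothetical operation set; the real SPROG-9M vocabulary may differ.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
}

def extract_slots(text):
    """Map each number in the text to a slot n0, n1, ... in order of appearance."""
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    return {f"n{i}": Fraction(num) for i, num in enumerate(numbers)}

def execute(program, slots):
    """Run the steps in order; step i stores its result as r{i}. Return the last result."""
    env = dict(slots)
    result = None
    for i, (op, left, right) in enumerate(program):
        result = OPS[op](env[left], env[right])
        env[f"r{i}"] = result
    return result

problem = "Sam buys 3 boxes with 12 pencils each and gives away 5 pencils."
# Hand-written stand-in for what a model would predict: (n0 * n1) - n2
program = [("mul", "n0", "n1"), ("sub", "r0", "n2")]
answer = execute(program, extract_slots(problem))
print(answer)  # 31
```

In this sketch the program refers only to slots and intermediate results, never to literal values, so the arithmetic is always carried out exactly by the executor, not predicted by the model.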

Articles 16

SPROG-9M: how far a 9-million-parameter, LLM-free model gets on grade-school math

Collections 8

Papers 5

arxiv:2506.08060

arxiv:2501.14249

arxiv:2407.18521

arxiv:2407.16557

Spaces 11

dhara-chat

CPU demo of dhara-250M tri-mode (AR/diffusion/self-spec)

PTS Visualizer

Visualize pivotal tokens and thought anchors in language models

Safety Copilot

Ask any health and safety question

Svg2png

Convert SVG to PNG with specified dimensions

MLX My Repo

Convert and upload Hugging Face models to MLX format

LLMSearchEngine

Search for information using an LLM

Models 33

codelion/sprog-9m

Question Answering • Updated 18 days ago • 198 • 3

codelion/dhara-250m-ar-base

Text Generation • 0.2B • Updated 18 days ago • 18 • 1

codelion/dhara-250m

Text Generation • 0.2B • Updated 18 days ago • 217 • 3

codelion/SmolLM2-70M

Text Generation • 69.2M • Updated Mar 8 • 52 • 3

codelion/malm-165m

Text Generation • Updated Jan 23 • 25 • 4

codelion/dhara-70m

Text Generation • 71.3M • Updated Dec 30, 2025 • 295 • 49

codelion/gpt-2-70m

Text Generation • 64.1M • Updated Nov 2, 2025 • 17 • 21

codelion/Qwen3-4B-execution-world-model-lora

Text Generation • Updated Oct 20, 2025 • 5 • 6

codelion/Qwen2.5-Coder-0.5B-Instruct-security-grpo-lora

Text Generation • Updated Aug 2, 2025 • 63

codelion/qwen2-5-coder-0-5b-instruct-progressive-2000k-lora

Text Generation • Updated Jul 20, 2025 • 6 • 2

Datasets 47

codelion/logical-puzzles-cot

Viewer • Updated 14 days ago • 22.2k • 193 • 3

codelion/gsm8k-synth

Viewer • Updated 25 days ago • 118k • 74

codelion/sutra-improved-100M

Viewer • Updated Mar 29 • 414k • 34 • 2

codelion/sutra-magpie-sft

Viewer • Updated Mar 8 • 20.7k • 29 • 2

codelion/sutra-30k-seeds

Viewer • Updated Mar 8 • 30.3k • 40 • 2

codelion/sutra-10M

Viewer • Updated Mar 8 • 7.25k • 35 • 3

codelion/sutra-100M

Viewer • Updated Mar 8 • 70.4k • 81 • 2

codelion/sutra-1B

Viewer • Updated Mar 8 • 429k • 83 • 2

codelion/sutra-10B

Viewer • Updated Mar 8 • 5M • 185 • 8

codelion/synth-1B

Viewer • Updated Nov 11, 2025 • 822k • 17 • 1
