Benjamin Therien

btherien

https://bentherien.github.io/

AI & ML interests

Passionate about machine learning research! Currently working on efficient foundation model pre-training and learned optimization.

Recent Activity

upvoted a paper 1 day ago

The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL

Paper • 2606.19162 • Published 3 days ago • 18

upvoted a paper 4 months ago

Privileged Information Distillation for Language Models

Paper • 2602.04942 • Published Feb 4 • 28

upvoted a paper 12 months ago

How to Train Your LLM Web Agent: A Statistical Diagnosis

Paper • 2507.04103 • Published Jul 5, 2025 • 52

upvoted an article 12 months ago

How to Train Your LLM Web Agent: A Statistical Diagnosis

Article • ppEmiliano • Jul 8, 2025 • 15

updated a dataset about 1 year ago

btherien/edufineweb100BT-tokenized

Updated May 20, 2025 • 148

published a dataset about 1 year ago

btherien/edufineweb100BT-tokenized

Updated May 20, 2025 • 148

updated a dataset about 1 year ago

btherien/imagenet-64x64x3

Updated Apr 21, 2025 • 5

published a dataset about 1 year ago

btherien/imagenet-64x64x3

Updated Apr 21, 2025 • 5

updated a dataset about 1 year ago

btherien/lm1b

Updated Apr 21, 2025 • 1.42k

published a dataset about 1 year ago

btherien/lm1b

Updated Apr 21, 2025 • 1.42k

updated a dataset about 1 year ago

btherien/imagenet-32x32x3

Updated Apr 17, 2025 • 4

published a dataset about 1 year ago

btherien/imagenet-32x32x3

Updated Apr 17, 2025 • 4

updated a dataset about 1 year ago

btherien/edufineweb-tokenized

Updated Apr 16, 2025 • 153

published a dataset about 1 year ago

btherien/edufineweb-tokenized

Updated Apr 16, 2025 • 153

updated a model about 1 year ago

btherien/mulo

Updated Apr 8, 2025 • 7

published a model about 1 year ago

btherien/mulo

Updated Apr 8, 2025 • 7

upvoted a collection almost 2 years ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 712

updated a collection almost 2 years ago

Continual Pre-training

Models from Simple and Scalable Strategies to Continually Pre-train Large Language Models • 0 items • Updated Apr 7