Ik-hwan Kim

12kimih

https://github.com/12kimih

AI & ML interests

Large Language Models, Reinforcement Learning, Multimodal AI, AI Agents, Mechanistic Interpretability

Recent Activity

updated a dataset about 1 month ago

12kimih/nemotron-math-v2-hard

published a dataset about 1 month ago

12kimih/nemotron-math-v2-hard

updated a dataset about 1 month ago

12kimih/nemotron-math-v2-medium


Organizations

None yet

upvoted an article 6 months ago

Article

Why Did MiniMax M2 End Up as a Full Attention Model?

MiniMax-AI • Oct 30, 2025 • 80

upvoted a paper 7 months ago

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published Oct 9, 2025 • 21

upvoted a paper 12 months ago

Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation

Paper • 2506.03621 • Published Jun 4, 2025 • 22

upvoted a collection about 1 year ago

DataDecide

Collection

A suite of models, data, and evals over 25 corpora, 14 sizes, and 3 seeds to measure how accurately small experiments predict rankings at large scale. • 354 items • Updated Mar 2 • 26

upvoted 7 articles about 1 year ago

Article

Open-source DeepResearch – Freeing our search agents

m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k

Article

How NuminaMath Won the 1st AIMO Progress Prize

yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif • Jul 11, 2024 • 128

Article

SmolLM - blazingly fast and remarkably powerful

loubnabnl, anton-l, eliebak • Jul 16, 2024 • 456

Article

Preference Tuning LLMs with Direct Preference Optimization Methods

kashif, edbeeching, lewtun, lvwerra, osanseviero • Jan 18, 2024 • 83

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun • Jan 28, 2025 • 889

Article

Putting RL back in RLHF

vwxyzjn, ArashAhmadian • Jun 12, 2024 • 111
