GRPO SQL Optimizer
Qwen/Qwen2.5-0.5B-Instruct fine-tuned with GRPO (Group Relative Policy Optimization)
reinforcement learning to optimize SQL queries, using a DuckDB execution environment.
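This card does not spell out how a rewritten query is scored, so the following is only a minimal sketch of what an execution-based reward could look like. The function names `timed_rows` and `sql_reward`, the order-insensitive result comparison, and the speed-based scoring formula are all assumptions made for illustration, not the author's implementation.

```python
import time


def timed_rows(con, sql):
    """Run `sql` on `con` and return (rows sorted for comparison, elapsed seconds)."""
    start = time.perf_counter()
    rows = con.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    # Assumption: row order is irrelevant, so results are compared as sorted lists.
    return sorted(rows, key=repr), elapsed


def sql_reward(con, baseline_sql, candidate_sql):
    """Hypothetical reward: 0.0 for a broken or incorrect rewrite, else a value between 0 and 1."""
    base_rows, base_time = timed_rows(con, baseline_sql)
    try:
        cand_rows, cand_time = timed_rows(con, candidate_sql)
    except Exception:
        return 0.0  # the candidate query failed to execute
    if cand_rows != base_rows:
        return 0.0  # the candidate changed the query result
    # Equal runtimes give 0.5; the score rises toward 1.0 as the candidate gets faster.
    return base_time / max(base_time + cand_time, 1e-9)
```

Any connection object that supports `con.execute(sql).fetchall()` works here. With DuckDB's Python package, `duckdb.connect()` returns an in-memory connection that follows this pattern. A single timing run is noisy, so a real setup would likely average several runs.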
Results
- Average eval score: 0.7550 (12.5% above baseline)
- Trained for 100 episodes on 5 SQL optimization tasks
Blog / Writeup
https://huggingface.co/spaces/laterabhi/grpo-sql-optimizer
Training Notebook
Trained on a Kaggle GPU T4 x2 accelerator (two NVIDIA T4 GPUs) using GRPO with verifiable rewards.
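For readers new to GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to the rest of its group, rather than against a learned value function. The sketch below shows that normalization step in plain Python. The function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are illustrative assumptions; training libraries may differ in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one group's rewards to zero mean and roughly unit scale."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    # eps keeps the division finite when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical completions for one SQL task, scored by a verifiable reward.
advantages = group_relative_advantages([0.0, 0.5, 1.0, 0.5])
```

Completions that score above the group mean receive a positive advantage and are reinforced, while those below the mean are pushed down. If all completions in a group earn the same reward, every advantage is zero and that group contributes no learning signal.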