Phase 4: V1-aware calibration verifier, eval tools, cleanup 2145d80 jang1563 Claude Sonnet 4.6 committed 15 days ago
Phase 3: Fix GRPO learning signal with continuous rewards and multi-reward 7dbf475 jang1563 Claude Opus 4.6 committed 24 days ago
Add BioGRPO training pipeline with composable biological verifiers bff2f94 jang1563 Claude Opus 4.6 committed 25 days ago