Commit History
Fix API key priority in eval.py to match inference.py 5b324a8
Fix API key priority: use API_KEY before HF_TOKEN bc62798
Rewrite inference.py to use OpenEnv client and BENCHMARK_URL 32c2a13
Add [START]/[STEP]/[END] structured output to inference.py 4f3e2c1
Add real benchmark results from model comparison experiment fb86de2
Tarkeshwar Claude Opus 4.6 (1M context) committed
Improve inference strategy: filter flagged columns, cap plan size b6e22aa
Tarkeshwar Claude Opus 4.6 (1M context) committed
Add top-3 differentiators: seed variation, TRL training, benchmarking 5523185
Tarkeshwar Claude Opus 4.6 (1M context) committed
Fix all code review findings (5 issues) 4812df3
Tarkeshwar Claude Opus 4.6 (1M context) committed
Transform to competition-grade submission b272983
Tarkeshwar Claude Opus 4.6 (1M context) committed
Restructure repo to match OpenEnv standard layout 7c6fd7d
Tarkeshwar Claude Opus 4.6 (1M context) committed
Fix critical issues from code review febcf68
Tarkeshwar Claude Opus 4.6 (1M context) committed
Improve inference agent for better scores c2ad419
Tarkeshwar Claude Opus 4.6 (1M context) committed
Fix stateful HTTP endpoints and add HF Spaces deployment config bb2fc43
Tarkeshwar Claude Opus 4.6 (1M context) committed