How does post-training Qwen3.5-35B produce better benchmark results than GPT5.5?

#9
by Ilm-Alan - opened

🤔

Intern Science org

@Ilm-Alan
Thanks so much for your interest in Agents-A1!

For details on how we trained the model, you can check out our tech report. But beyond just the training, we really believe that how you set up the evaluation and the harness is just as crucial in agentic scenarios.

The good news is, we've open-sourced all the evaluation methods and code behind our reported benchmarks. Check them out on GitHub at https://github.com/InternScience/Agents-A1.
