allenai/tmax-9b
Data and models associated with "Tmax: A simple recipe for terminal agents". Paper: https://arxiv.org/abs/2606.23321
Note RLed model on top of Qwen 3.5 9B, with rollouts and model checkpoints through training.
Note RLed model on top of Qwen 3.6 27B.
Note RLed model on top of Qwen 3.5 4B.
Note RLed model on top of Qwen 3.5 2B.
Note SFTed model on top of Qwen3 8B.
Note RLed model on top of tmax-sft-8b
Note RL data, formatted for use with open-instruct.
Note Data for RL training, for general use.
Note SFT data used for tmax-8b
Note SFT data, but with more details, for general use.
Below: data/models for various ablations.