Commit History

Upload training/train_grpo.py with huggingface_hub
06b7563
verified

piyush-mk committed on

v6: GRPO with --resume-adapter to start from SFT checkpoint
59ed7f3
verified

piyush-mk committed on

v5d: save best-epoch checkpoint during SFT
a674764
verified

piyush-mk committed on

v5c: add --min-investigation-steps filter for submit-only with longer contexts
2857a20
verified

piyush-mk committed on

v5b: add --submit-only SFT mode
2a667e7
verified

piyush-mk committed on

v5: remove 4-bit quant, variable-length traces, submit oversampling, GRPO exploration fixes
1d75102
verified

piyush-mk committed on

Upload folder using huggingface_hub
4cd2128
verified

piyush-mk committed on

Fix: decode with skip_special_tokens=False to preserve <think> tags for stripping
231d76f
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: training/launch_hf_job.py
befff2d
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: training/eval_adapter.py
664fa8e
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: training/train_sft.py
7c48dcf
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: training/train_grpo.py
9d40fed
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: training/rollout.py
f788adf
verified

piyush-mk committed on

Fix Qwen3 thinking mode + increase max_new_tokens: inference.py
4afdc43
verified

piyush-mk committed on

Add InvoiceGuard SFT and merge tooling
25ee43d
verified

piyush-mk committed on

Add InvoiceGuard adapter eval job script
90c05cf
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
02767f3
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
7db1128
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
7c278c9
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
c5cfe6d
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
d1e9e92
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
c01c661
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
fdd2256
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
25e6791
verified

piyush-mk committed on

Sync InvoiceGuard code for GRPO training job
9a88af0
verified

piyush-mk committed on

initial commit
98115dc
verified

piyush-mk committed on