ihbkaiser/trl-mcsd
arxiv: 2402.03300, 2305.18290, 2407.21783
main
trl-mcsd/examples/scripts/openenv
119 kB
1 contributor
History: 1 commit
ihbkaiser: Implement MCSD for experimental SDPO (1fa3c6c, verified, about 2 months ago)
File                 Size     Last commit                           Updated
browsergym.py        10.9 kB  Implement MCSD for experimental SDPO  about 2 months ago
browsergym_llm.py    15.1 kB  Implement MCSD for experimental SDPO  about 2 months ago
carla.py             7.3 kB   Implement MCSD for experimental SDPO  about 2 months ago
carla_vlm.py         9.12 kB  Implement MCSD for experimental SDPO  about 2 months ago
carla_vlm_gemma.py   11.2 kB  Implement MCSD for experimental SDPO  about 2 months ago
catch.py             12.4 kB  Implement MCSD for experimental SDPO  about 2 months ago
echo.py              4.01 kB  Implement MCSD for experimental SDPO  about 2 months ago
multi_env.py         10.1 kB  Implement MCSD for experimental SDPO  about 2 months ago
sudoku.py            25.2 kB  Implement MCSD for experimental SDPO  about 2 months ago
sudoku_prompt.txt    4.3 kB   Implement MCSD for experimental SDPO  about 2 months ago
wordle.py            9.13 kB  Implement MCSD for experimental SDPO  about 2 months ago