Spaces: shutdowngym/RedButton-v2
Branch: main
Path: RedButton-v2/tests (57.2 kB)
2 contributors
History: 3 commits
Latest commit: Hugging557444, "V2-C: three-agent environment + client + demo" (449736a), about 1 month ago
File                       Scan  Size     Last commit                                                      Updated
__init__.py                Safe  0 Bytes  V2-A: bootstrap RedButton v2 with v1 verbatim reusables          about 1 month ago
test_audit_v2.py           Safe  8.54 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_auditor.py            Safe  3.36 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_environment_v2.py     Safe  11.1 kB  V2-C: three-agent environment + client + demo                    about 1 month ago
test_failure_modes.py      Safe  6.34 kB  V2-C: three-agent environment + client + demo                    about 1 month ago
test_models_v2.py          Safe  1.46 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_operator.py           Safe  4.98 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_problems.py           Safe  2.64 kB  V2-A: bootstrap RedButton v2 with v1 verbatim reusables          about 1 month ago
test_restricted_python.py  Safe  3.72 kB  V2-A: bootstrap RedButton v2 with v1 verbatim reusables          about 1 month ago
test_rubrics_v2.py         Safe  6.25 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_sandbox.py            Safe  4.56 kB  V2-A: bootstrap RedButton v2 with v1 verbatim reusables          about 1 month ago
test_tiers_v2.py           Safe  2.08 kB  V2-B: core v2 modules + tests (auditor, deception, sandbagging)  about 1 month ago
test_timer.py              Safe  2.24 kB  V2-A: bootstrap RedButton v2 with v1 verbatim reusables          about 1 month ago