Hugging Face
Yimin Liu
yiminn
1 follower · 1 following
Yiminnn
yimin-liu-15681b17a
AI & ML interests
None yet
Recent Activity
New activity, 4 days ago, in benchflow/skillsbench-leaderboard:
Add SkillsBench v1.1 agent-decouple integration matrix (7 ACP/decoupled agents x deepseek-v4-flash x with/without skill, docker; 1183 runs / 1014 healthy)
New activity, 17 days ago, in benchflow/skillsbench-leaderboard:
Evidence archive: task.md-reformatted SkillsBench 87x3 (openhands + deepseek-v4-flash, 1 trial/condition, 261/261 healthy)
Authored a paper, 29 days ago:
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
Organizations
yiminn's activity
New activity in
benchflow/skillsbench-leaderboard
4 days ago
Add SkillsBench v1.1 agent-decouple integration matrix (7 ACP/decoupled agents x deepseek-v4-flash x with/without skill, docker; 1183 runs / 1014 healthy)
#16 opened 4 days ago by yiminn
New activity in
benchflow/skillsbench-leaderboard
17 days ago
Evidence archive: task.md-reformatted SkillsBench 87x3 (openhands + deepseek-v4-flash, 1 trial/condition, 261/261 healthy)
#13 opened 17 days ago by yiminn
New activity in
benchflow/skillsbench-leaderboard
about 2 months ago
PR3 OpenHands Chinese PR2-delta 6x3 matrix
#3 opened about 2 months ago by yiminn
Add SkillsBench useful trajectory snapshot
#2 opened about 2 months ago by yiminn
Add finished SkillsBench usable trajectories
#1 opened about 2 months ago by yiminn