99

cutechicken

cutechicken99

AI & ML interests

None yet

Recent Activity

liked a dataset about 7 hours ago

ginigen-ai/Metacognition-Bench

liked a Space about 7 hours ago

ginigen-ai/Metacognition-Leaderboard-Space

reacted to ginigen-ai's post with ❤️ about 8 hours ago

🍳 The RoboCasa Kitchen Leaderboard

What does it take for a robot to handle kitchen chores the way a person does? It has to see (Vision), understand instructions (Language), and actually act (Action), and VLA (Vision-Language-Action) models are emerging as the answer. They're the bridge between large multimodal models and real-world embodied control.

RoboCasa Kitchen is a leading robot-learning benchmark in which a single-arm robot (Franka Panda) performs 24 atomic manipulation tasks (picking up cups and bowls, opening drawers and doors, turning faucets, pressing buttons, and more) inside a photorealistic simulated kitchen. Because the layout and object placement are randomized every episode, it tests genuine generalization rather than memorized motions. The score (success rate, SR) is the average fraction of the 24 tasks completed as instructed, measured over multiple seeds so results aren't down to luck.

The catch: this benchmark has no official leaderboard, and protocols (number of demonstrations, evaluation setup) differ from paper to paper, leaving scores scattered. Lining the numbers up naively quickly turns into an apples-to-oranges comparison.

This leaderboard fixes that by collecting published scores with their sources and comparing only what is genuinely comparable. It's split into three tables:

🏆 Kitchen 24-task (matched): head-to-head under identical conditions (per the RLDX-1 Technical Report). This is the core ranking you can actually trust.
➕ Other protocols: self-reported under different setups (e.g. fewer demos). Not directly comparable, so kept separate.
🤖 GR1-Tabletop: a different, humanoid-based variant suite, separated to avoid confusion.

Any researcher can submit their own model's score directly, and submissions are reviewed before they appear on the board. Every number links to its source paper, so you can verify it yourself.

👉 https://huggingface.co/spaces/ginigen-ai/robocasa-kitchen-leaderboard
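
The success-rate definition in the post (average fraction of tasks completed, measured over multiple seeds) could be computed roughly as in the sketch below. The post does not specify the aggregation order, so averaging per task first and then across seeds is an assumption, and the function name and input layout are hypothetical rather than part of RoboCasa or the leaderboard.

```python
from statistics import mean


def success_rate(results):
    """Average success rate across tasks, then across seeds.

    `results` is a hypothetical layout: {seed: {task_name: [bool, ...]}},
    where each bool records whether one episode of that task succeeded.
    The per-task-then-per-seed averaging order is an assumption.
    """
    per_seed_scores = []
    for tasks in results.values():
        # Fraction of successful episodes for each task under this seed.
        per_task = [sum(episodes) / len(episodes) for episodes in tasks.values()]
        # Average over the tasks evaluated under this seed.
        per_seed_scores.append(mean(per_task))
    # Average over seeds so a single lucky run does not dominate.
    return mean(per_seed_scores)
```

With 24 tasks per seed, each seed contributes one averaged score, and the reported SR is the mean of those per-seed scores.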

Organizations

spaces 15

demo_C2S_Scale

InvestmentsStrategyUsingSentiment

Emoji Generator

Translate text to emojis with Gemma 3 270M running on-device

Ovi

Generate a video from an image with a text prompt

Peer Server

Submit prompts to generate images using idle GPUs

3D Air Combat Simulator

One-minute creation by AI Coding Autonomous Agent 'MOUSE-I'

models 0

None public yet

datasets 0

None public yet