陽菜斎藤

ahernandez2023

AI & ML interests

Research on LLM agents and evaluation.

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data

Paper • 2606.13432 • Published 5 days ago • 96

upvoted a paper 3 days ago

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Paper • 2606.09426 • Published 8 days ago • 100

upvoted 3 papers 4 days ago

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Paper • 2606.08063 • Published 10 days ago • 76

ABot-Earth 0.5: Generative 3D Earth Model

Paper • 2606.09967 • Published 8 days ago • 467

InterleaveThinker: Reinforcing Agentic Interleaved Generation

Paper • 2606.13679 • Published 5 days ago • 77

upvoted a paper 5 days ago

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Paper • 2606.11926 • Published 6 days ago • 110

upvoted 3 papers 14 days ago

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Paper • 2605.30611 • Published 19 days ago • 193

ChildVox: A Speech, Audio, and Large Audio-Language Model Benchmark in Understanding and Characterizing Sound across Childhood

Paper • 2605.29257 • Published 19 days ago • 10

Is Position Bias in Dense Retrievers Built In-or Learned from Data?

Paper • 2605.26578 • Published 21 days ago • 20