ContextualBench

company
https://github.com/MetaMind

Phi Nguyen, Shrey Pandit

SP2001 
authored 3 papers 2 months ago

Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms

Paper • 2510.13913 • Published Oct 15 • 3

EgoVLM: Policy Optimization for Egocentric Video Understanding

Paper • 2506.03097 • Published Jun 3

Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math

Paper • 2510.13744 • Published Oct 15 • 5
SP2001 
authored a paper 4 months ago

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17
SP2001 
authored 3 papers 10 months ago

CodeUpdateArena: Benchmarking Knowledge Editing on API Updates

Paper • 2407.06249 • Published Jul 8, 2024

SFR-RAG: Towards Contextually Faithful LLMs

Paper • 2409.09916 • Published Sep 16, 2024 • 1

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

Paper • 2410.03727 • Published Sep 30, 2024 • 2
nxphi47 
authored a paper about 2 years ago

SeaLLMs -- Large Language Models for Southeast Asia

Paper • 2312.00738 • Published Dec 1, 2023 • 25