The BERDS Benchmark aims to measure retrieval diversity for questions that are opinionated or invite diverse perspectives.
Hung-Ting Chen
timchen0618
AI & ML interests: NLP
Recent Activity
updated a Space 2 days ago: timchen0618/monaco-benchmark-viewer
published a Space 2 days ago: timchen0618/monaco-benchmark-viewer
updated a Space 3 days ago: timchen0618/qampari-dev-viewer
Organizations
spaces 4
Running
MoNaCo Benchmark Viewer
🧩
Explore MoNaCo benchmark questions with answers and evidence
Running
QAMPARI Dev Data Explorer
📚
Explore QAMPARI QA dataset with searchable questions and answers
Sleeping
Browsecomp Plus Viz
😻
Explore and preview UI components in a web catalog
Running
Research Dashboard
📊
View and explore data with the RACA Dashboard
datasets 40
timchen0618/browsecomp-plus-benchmark
Viewer • Updated • 830 • 1.38k
timchen0618/browsecomp-plus-sel-tools-test300-random-seed7-v1
Viewer • Updated • 300 • 160
timchen0618/browsecomp-plus-sel-tools-test300-random-seed6-v1
Viewer • Updated • 300 • 132
timchen0618/browsecomp-plus-sel-tools-test300-random-seed5-v1
Viewer • Updated • 300 • 142
timchen0618/browsecomp-plus-sel-tools-test300-random-seed4-v1
Viewer • Updated • 300 • 123
timchen0618/browsecomp-plus-sel-tools-test300-random-seed3-v1
Viewer • Updated • 300 • 129
timchen0618/browsecomp-plus-sel-tools-test300-random-seed1-v1
Viewer • Updated • 300 • 135
timchen0618/browsecomp-plus-sel-tools-test300-random-seed0-v1
Viewer • Updated • 300 • 136
timchen0618/browsecomp-plus-sel-tools-test300-gemini-3p1-pro-v1
Viewer • Updated • 300 • 145
timchen0618/browsecomp-plus-sel-tools-test300-gemini-2p5-pro-v1
Viewer • Updated • 300 • 134