Hiro-Layout / EVALUATION.md

Evaluation Results

The benchmark contains two evaluation sets:

  • Patent PDF
  • NPD PDF

Metrics are reported as Precision, Recall, and F1, each rounded to four decimal places. The ALL rows in the per-benchmark tables below are copied from the source workbook.

Summary

| Benchmark | Labels | Precision | Recall | F1 |
|---|---|---|---|---|
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
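
The reported F1 values are consistent with the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is only a sanity check on the summary rows, not the workbook's own computation; the function name `f1_score` and the `SUMMARY` dictionary are illustrative names introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) transcribed from the summary rows.
SUMMARY = {
    "Patent PDF": (0.8144, 0.7711, 0.7922),
    "NPD PDF": (0.7090, 0.6983, 0.7036),
}

for name, (p, r, reported) in SUMMARY.items():
    computed = f1_score(p, r)
    # The tolerance allows for the four-decimal rounding of the inputs.
    assert abs(computed - reported) < 1e-4, name
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Because precision and recall are themselves rounded, small differences in the last decimal place are possible for other rows.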

Patent PDF

| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|
| 1 | figure | graph | graph | 图表 | 215 | 0.7611 | 0.8000 | 0.7800 |
| 2 | figure | draw | drawing | 绘制图 | 420 | 0.8649 | 0.3048 | 0.4507 |
| 3 | figure | struc | structure diagram | 结构图 | 626 | 0.6579 | 0.8355 | 0.7361 |
| 4 | figure | photo | photograph | 照片 | 147 | 0.8378 | 0.8435 | 0.8407 |
| 5 | figure | tab | table | 表格 | 198 | 0.7759 | 0.9091 | 0.8372 |
| 6 | figure | eqn | math equation | 数学公式 | 399 | 0.7762 | 0.6692 | 0.7187 |
| 7 | figure | chem | chemical formula | 化学式 | 1,099 | 0.8792 | 0.8944 | 0.8868 |
| 8 | figure | noise | noise | 噪声 | 1,241 | 0.7025 | 0.7687 | 0.7341 |
| 9 | text | text | text | 文本 | 17,668 | 0.8182 | 0.8062 | 0.8122 |
| 10 | text | title | title | 标题 | 601 | 0.9117 | 0.8070 | 0.8561 |
| 11 | text | sec | section title | 章节标题 | 1,394 | 0.7968 | 0.7088 | 0.7502 |
| 12 | text | head | page header | 页眉 | 3,074 | 0.8187 | 0.7788 | 0.7983 |
| 13 | text | foot | page footer | 页脚 | 1,012 | 0.7432 | 0.6433 | 0.6896 |
| 14 | text | mnote | marginal note | 边注 | 421 | 0.7794 | 0.5202 | 0.6239 |
| 15 | text | cap | caption | 说明 | 80 | 0.6842 | 0.4875 | 0.5693 |
| 16 | text | figno | figure number | 编号 | 1,389 | 0.8955 | 0.7466 | 0.8143 |
| 17 | text | lineno | line number | 行号 | 341 | 0.7759 | 0.6598 | 0.7132 |
| 18 | text | colno | column number | 栏号 | 449 | 0.6964 | 0.4699 | 0.5612 |
| 19 | text | seq | sequence | 序列表 | 136 | 0.4430 | 0.2574 | 0.3256 |
| 20 | complex | figcx | figure complex | 图片组 | 1,416 | 0.8657 | 0.7373 | 0.7963 |
| 21 | complex | rxn | chemical reaction | 反应式 | 150 | 0.8898 | 0.7000 | 0.7836 |
| 22 | complex | bib | bibliography | 著录页 | 470 | 0.9615 | 0.7979 | 0.8721 |
| 23 | complex | srep | search report | 搜索报告 | 106 | 0.9052 | 0.9906 | 0.9459 |
| 24 | complex | toc | Table of Contents | 目录 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 25 | complex | ref | reference | 参考文献 | 2 | 0.0000 | 0.0000 | 0.0000 |
| ALL | | | | | 33,054 | 0.8144 | 0.7711 | 0.7922 |

NPD PDF

| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|
| 1 | figure | graph | graph | 图表 | 248 | 0.6838 | 0.6976 | 0.6906 |
| 2 | figure | draw | drawing | 绘制图 | 9 | 0.0000 | 0.0000 | 0.0000 |
| 3 | figure | struc | structure diagram | 结构图 | 341 | 0.7454 | 0.7126 | 0.7286 |
| 4 | figure | photo | photograph | 照片 | 82 | 0.6071 | 0.6220 | 0.6145 |
| 5 | figure | tab | table | 表格 | 209 | 0.7533 | 0.8182 | 0.7844 |
| 6 | figure | eqn | math equation | 数学公式 | 298 | 0.6789 | 0.5604 | 0.6140 |
| 7 | figure | chem | chemical formula | 化学式 | 388 | 0.7324 | 0.8325 | 0.7793 |
| 8 | figure | noise | noise | 噪声 | 695 | 0.4823 | 0.4302 | 0.4548 |
| 9 | text | text | text | 文本 | 9,119 | 0.6943 | 0.7625 | 0.7268 |
| 10 | text | title | title | 标题 | 304 | 0.7130 | 0.5395 | 0.6142 |
| 11 | text | sec | section title | 章节标题 | 1,539 | 0.7337 | 0.6160 | 0.6697 |
| 12 | text | head | page header | 页眉 | 1,246 | 0.7464 | 0.7111 | 0.7283 |
| 13 | text | foot | page footer | 页脚 | 1,339 | 0.7711 | 0.6468 | 0.7035 |
| 14 | text | mnote | marginal note | 边注 | 190 | 0.5714 | 0.2947 | 0.3889 |
| 15 | text | cap | caption | 说明 | 573 | 0.8711 | 0.5899 | 0.7034 |
| 16 | text | figno | figure number | 编号 | 149 | 0.6078 | 0.4161 | 0.4940 |
| 17 | text | lineno | line number | 行号 | 41 | 0.6667 | 0.9268 | 0.7755 |
| 18 | text | colno | column number | 栏号 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 19 | text | seq | sequence | 序列表 | 18 | 0.7000 | 0.3889 | 0.5000 |
| 20 | complex | figcx | figure complex | 图片组 | 734 | 0.7657 | 0.7480 | 0.7567 |
| 21 | complex | rxn | chemical reaction | 反应式 | 36 | 0.8947 | 0.4722 | 0.6182 |
| 22 | complex | bib | bibliography | 著录页 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 23 | complex | srep | search report | 搜索报告 | 3 | 0.4286 | 1.0000 | 0.6000 |
| 24 | complex | toc | Table of Contents | 目录 | 76 | 0.8475 | 0.6579 | 0.7407 |
| 25 | complex | ref | reference | 参考文献 | 132 | 0.8148 | 0.3333 | 0.4731 |
| ALL | | | | | 17,769 | 0.7090 | 0.6983 | 0.7036 |
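
As a quick consistency check, the per-class label counts in both tables can be summed and compared with the Labels value in each ALL row. The sketch below transcribes the counts in row order (1 to 25); the names `PATENT_LABELS`, `NPD_LABELS`, and `REPORTED_TOTALS` are illustrative and not part of the workbook.

```python
# Per-class label counts, rows 1-25, transcribed from the tables above.
PATENT_LABELS = [
    215, 420, 626, 147, 198, 399, 1099, 1241,        # figure
    17668, 601, 1394, 3074, 1012, 421, 80, 1389,     # text (rows 9-16)
    341, 449, 136,                                   # text (rows 17-19)
    1416, 150, 470, 106, 0, 2,                       # complex
]
NPD_LABELS = [
    248, 9, 341, 82, 209, 298, 388, 695,             # figure
    9119, 304, 1539, 1246, 1339, 190, 573, 149,      # text (rows 9-16)
    41, 0, 18,                                       # text (rows 17-19)
    734, 36, 0, 3, 76, 132,                          # complex
]

# Labels values from the ALL rows.
REPORTED_TOTALS = {"Patent PDF": 33054, "NPD PDF": 17769}

for name, counts in (("Patent PDF", PATENT_LABELS), ("NPD PDF", NPD_LABELS)):
    total = sum(counts)
    empty = sum(1 for c in counts if c == 0)
    assert total == REPORTED_TOTALS[name], name
    print(f"{name}: {total} labels across {len(counts)} classes, "
          f"{empty} class(es) with zero labels")
```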

Notes

  • Rows with zero labels are retained because they appear in the source workbook; their metrics are shown as 0.0000.
  • The summary table uses the workbook's ALL rows rather than recomputing aggregate F1 from per-class rows.
  • The benchmark is in-house and should be described as such unless the dataset is later cleared for public release.
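
To illustrate why the summary takes the ALL rows as given, the sketch below computes an unweighted (macro) mean of the per-class F1 values for Patent PDF, with and without the zero-label row, and compares both with the reported ALL F1. How the workbook derives its ALL row is not stated here, so this only shows that a naive recomputation from per-class rows gives a different number. The names `PATENT_ROWS`, `PATENT_ALL_F1`, and `macro_f1` are hypothetical.

```python
# (labels, F1) for rows 1-25 of the Patent PDF table.
PATENT_ROWS = [
    (215, 0.7800), (420, 0.4507), (626, 0.7361), (147, 0.8407),
    (198, 0.8372), (399, 0.7187), (1099, 0.8868), (1241, 0.7341),
    (17668, 0.8122), (601, 0.8561), (1394, 0.7502), (3074, 0.7983),
    (1012, 0.6896), (421, 0.6239), (80, 0.5693), (1389, 0.8143),
    (341, 0.7132), (449, 0.5612), (136, 0.3256), (1416, 0.7963),
    (150, 0.7836), (470, 0.8721), (106, 0.9459), (0, 0.0000),
    (2, 0.0000),
]
PATENT_ALL_F1 = 0.7922  # F1 from the ALL row


def macro_f1(rows, skip_empty=False):
    """Unweighted mean of per-class F1, optionally dropping zero-label rows."""
    kept = [f1 for labels, f1 in rows if labels > 0 or not skip_empty]
    return sum(kept) / len(kept)


with_empty = macro_f1(PATENT_ROWS)
without_empty = macro_f1(PATENT_ROWS, skip_empty=True)

print(f"macro F1, all 25 rows:       {with_empty:.4f}")
print(f"macro F1, zero-label dropped: {without_empty:.4f}")
print(f"reported ALL F1:              {PATENT_ALL_F1:.4f}")
```

Both macro averages come out below the ALL value, since small classes with low F1 count as much as the dominant text class, and keeping zero-label rows lowers the average further.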