Snippets Groups Projects

Implement the Tree Edit Distance metric

The Tree Edit Distance (TED) is a Levenshtein distance based on tree representations. It is independent of reading order and works with nested entities.

From the Donut paper

To assess overall accuracy, we also use another metric based on TED [68], that can be used for any documents represented as trees. It is calculated as, max(0, 1−TED(pr, gt)/TED(ϕ, gt)), where gt, pr, and ϕ stands for ground truth, predicted, and empty trees respectively. Similar metrics are used in recent works on document IE [70,23].

Original paper:

[68] Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM J. Comput. 18, 1245–1262 (12 1989). https://doi.org/10.1137/0218082

Used in:

[70] Zhong, X., ShafieiBavani, E., Jimeno Yepes, A.: Image-based table recognition: Data, model, and evaluation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. pp. 564–580. Springer International Publishing, Cham (2020) https://arxiv.org/pdf/1911.10683.pdf

[23] Hwang, W., Lee, H., Yim, J., Kim, G., Seo, M.: Cost-effective end-to-end information extraction for semi-structured document images. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. pp. 3375–3383. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic (Nov 2021). https://doi.org/10.18653/v1/2021.emnlp-main.271, https://aclanthology.org/2021.emnlp-main.271

Questions:

what cost for character/word edition
what cost for label edition

Designs

Child items ...

Activity

Solene Tarride added P2 label 1 year ago

added P2 label
Solene Tarride assigned to @starride 1 year ago

assigned to @starride
Solene Tarride created branch implement-tree-edit-distance-metric to address this issue 1 year ago

created branch implement-tree-edit-distance-metric to address this issue
Solene Tarride mentioned in merge request !12 1 year ago

mentioned in merge request !12

Please register or sign in to reply