diff --git a/docs/usage/evaluate/index.md b/docs/usage/evaluate/index.md
index 84039e79bd38f8d6f3a20c64eb81dd8a7e20e635..1765903482c87513434e8ccf632ab72f6297982c 100644
--- a/docs/usage/evaluate/index.md
+++ b/docs/usage/evaluate/index.md
@@ -12,15 +12,19 @@ To evaluate DAN on your dataset:
 This will, for each evaluated split:
 
 1. Create a YAML file with the evaluation results in the `results` subfolder of the `training.output_folder` indicated in your configuration.
-1. Print in the console a metrics Markdown table (see [table example below](#htr-evaluation)).
-1. Print in the console a Nerval metrics Markdown table, if the `dataset.tokens` parameter in your configuration is defined (see [table example below](#htr-and-ner-evaluation)).
+1. Print in the console a metrics Markdown table (see [HTR example below](#htr-evaluation)).
+1. Print in the console a Nerval metrics Markdown table, if the `dataset.tokens` parameter in your configuration is defined (see [HTR and NER example below](#htr-and-ner-evaluation)).
+1. Print in the console the 5 worst predictions (see [examples below](#examples)).
+
+!!! warning
+    The display of the worst predictions does not support batch evaluation. If the `training.data.batch_size` parameter is not equal to `1`, then the `WER` displayed is the `WER` of the **whole batch** and not just the image.
 
 | Parameter | Description | Type | Default |
 | -------------------- | -------------------------------------------------------------------------------------------- | -------------- | ------- |
 | `--config` | Path to the configuration file. | `pathlib.Path` | |
 | `--nerval-threshold` | Distance threshold for the match between gold and predicted entity during Nerval evaluation. | `float` | `0.3` |
 
-## Example output
+## Examples
 
 ### HTR evaluation
 
@@ -32,6 +36,14 @@ This will, for each evaluated split:
 | train | x | x | x | x | x |
 | val | x | x | x | x | x |
 | test | x | x | x | x | x |
+
+#### 5 worst prediction(s)
+
+| Image name | WER | Alignment between ground truth - prediction |
+| :------------: | :-: | :-----------------------------------------: |
+| <image_id>.png | x | x |
+| | | | |
+| | | x |
 ```
 
 ### HTR and NER evaluation
 
@@ -67,4 +79,12 @@ This will, for each evaluated split:
 | :-----: | :-------: | :-----: | :-------: | :----: | :-: | :-----: |
 | Surname | x | x | x | x | x | x |
 | All | x | x | x | x | x | x |
+
+#### 5 worst prediction(s)
+
+| Image name | WER | Alignment between ground truth - prediction |
+| :------------: | :-: | :-----------------------------------------: |
+| <image_id>.png | x | x |
+| | | | |
+| | | x |
 ```