
Evaluation

Description

Use the `teklia-dan evaluate` command to evaluate a trained DAN model.

To evaluate DAN on your dataset:

  1. Create a JSON configuration file. You can use the training configuration file as a starting point. Refer to the dedicated page for a description of the parameters.
  2. Run `teklia-dan evaluate --config path/to/your/config.json`.
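
The steps above can also be scripted. The following is a minimal sketch, not part of DAN itself: `build_evaluate_command` is a hypothetical helper that assembles the command line from the options documented on this page. It assumes that several set names are passed space-separated after `--sets`; check `teklia-dan evaluate --help` if in doubt.

```python
import shlex


def build_evaluate_command(config_path, sets=None, nerval_threshold=None, output_json=None):
    """Assemble the argument list for `teklia-dan evaluate` (hypothetical helper)."""
    command = ["teklia-dan", "evaluate", "--config", str(config_path)]
    if sets:
        # Assumption: multiple set names are given space-separated after --sets.
        command += ["--sets", *sets]
    if nerval_threshold is not None:
        command += ["--nerval-threshold", str(nerval_threshold)]
    if output_json is not None:
        command += ["--output-json", str(output_json)]
    return command


command = build_evaluate_command(
    "path/to/your/config.json",
    sets=["val", "test"],
    output_json="results.json",
)
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

To actually launch the evaluation from Python, the resulting list can be handed to `subprocess.run(command, check=True)`.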

This will, for each evaluated split:

  1. Create a YAML file with the evaluation results in the `results` subfolder of the `training.output_folder` indicated in your configuration.
  2. Print a Markdown table of metrics to the console (see the HTR example below).
  3. Print a Markdown table of Nerval metrics to the console, if the `dataset.tokens` parameter is defined in your configuration (see the HTR and NER example below).
  4. Print the 5 worst predictions to the console (see the examples below).

!!! warning

    The display of the worst predictions does not support batch evaluation. If the `training.data.batch_size` parameter is not equal to `1`, the `WER` displayed is that of the **whole batch**, not of the individual image.
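
To see why a batch-level `WER` can hide a bad prediction, here is a small illustrative sketch. It assumes the batch `WER` is the total number of word errors divided by the total number of ground-truth words; DAN's exact aggregation may differ, and the sentences are made-up examples.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def wer(pairs):
    """WER over (ground truth, prediction) pairs: word errors / ground-truth words."""
    errors = sum(word_edit_distance(gt.split(), pred.split()) for gt, pred in pairs)
    words = sum(len(gt.split()) for gt, _ in pairs)
    return errors / words


batch = [
    ("the quick brown fox", "the quick brown fox"),   # perfect prediction
    ("jumps over the lazy dog", "jumps over a dog"),  # 1 substitution + 1 deletion
]

per_image = [wer([pair]) for pair in batch]  # one WER per image
whole_batch = wer(batch)                     # 2 errors over 9 words

print(per_image)              # [0.0, 0.4]
print(round(whole_batch, 3))  # 0.222
```

With a batch size of 2, both images would be reported with a `WER` of about 0.22, although one is perfect and the other has a `WER` of 0.4.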

| Parameter | Description | Type | Default |
| --------- | ----------- | ---- | ------- |
| `--config` | Path to the configuration file. | `pathlib.Path` | |
| `--nerval-threshold` | Distance threshold for the match between gold and predicted entity during Nerval evaluation. `0` would impose perfect matches, `1` would allow completely different strings to be considered as a match. | `float` | `0.3` |
| `--output-json` | Where to save evaluation results in JSON format. | `pathlib.Path` | `None` |
| `--sets` | Sets to evaluate. Defaults to `train`, `val`, `test`. | `list[str]` | `["train", "val", "test"]` |
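
The effect of `--nerval-threshold` can be illustrated with a short sketch. It assumes that the distance is a character-level edit distance normalized by the length of the longer string; this is an illustration of the idea, not Nerval's actual implementation, and `is_match` is a hypothetical helper.

```python
def char_edit_distance(a, b):
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def is_match(gold, predicted, threshold=0.3):
    """Return True if the normalized distance is within the threshold."""
    longest = max(len(gold), len(predicted))
    if longest == 0:
        return True  # two empty strings are identical
    # Assumption: normalize by the longer string so the ratio stays in [0, 1].
    return char_edit_distance(gold, predicted) / longest <= threshold


print(is_match("Dupont", "Dupond"))               # True: 1 error over 6 characters
print(is_match("Dupont", "Dupond", threshold=0))  # False: only exact matches pass
print(is_match("Dupont", "Martin", threshold=1))  # True: any string passes
```

A threshold of `0` therefore only accepts identical strings, while `1` accepts any prediction.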

Examples

HTR evaluation

#### DAN evaluation

| Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) |
| :---: | :-----------: | :-------: | :-----------: | :-------: | :----------------: |
| train |       x       |     x     |       x       |     x     |         x          |
|  val  |       x       |     x     |       x       |     x     |         x          |
| test  |       x       |     x     |       x       |     x     |         x          |

#### 5 worst prediction(s)

|   Image name   | WER | Alignment between ground truth - prediction |
| :------------: | :-: | :-----------------------------------------: |
| <image_id>.png |  x  |                      x                      |
|                |     |                      |                      |
|                |     |                      x                      |

HTR and NER evaluation

#### DAN evaluation

| Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) | NER |
| :---: | :-----------: | :-------: | :-----------: | :-------: | :----------------: | :-: |
| train |       x       |     x     |       x       |     x     |         x          |  x  |
|  val  |       x       |     x     |       x       |     x     |         x          |  x  |
| test  |       x       |     x     |       x       |     x     |         x          |  x  |

#### Nerval evaluation

##### train

|   tag   | predicted | matched | Precision | Recall | F1  | Support |
| :-----: | :-------: | :-----: | :-------: | :----: | :-: | :-----: |
| Surname |     x     |    x    |     x     |   x    |  x  |    x    |
|   All   |     x     |    x    |     x     |   x    |  x  |    x    |

##### val

|   tag   | predicted | matched | Precision | Recall | F1  | Support |
| :-----: | :-------: | :-----: | :-------: | :----: | :-: | :-----: |
| Surname |     x     |    x    |     x     |   x    |  x  |    x    |
|   All   |     x     |    x    |     x     |   x    |  x  |    x    |

##### test

|   tag   | predicted | matched | Precision | Recall | F1  | Support |
| :-----: | :-------: | :-----: | :-------: | :----: | :-: | :-----: |
| Surname |     x     |    x    |     x     |   x    |  x  |    x    |
|   All   |     x     |    x    |     x     |   x    |  x  |    x    |

#### 5 worst prediction(s)

|   Image name   | WER | Alignment between ground truth - prediction |
| :------------: | :-: | :-----------------------------------------: |
| <image_id>.png |  x  |                      x                      |
|                |     |                      |                      |
|                |     |                      x                      |
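
The `Precision`, `Recall` and `F1` columns of the Nerval tables above can be related to the `predicted`, `matched` and `Support` counts with the standard definitions. The sketch below assumes that `Support` is the number of ground-truth entities; the counts are hypothetical, and Nerval may round or scale the values differently.

```python
def nerval_scores(predicted, matched, support):
    """Standard precision, recall and F1 from entity counts."""
    precision = matched / predicted if predicted else 0.0
    recall = matched / support if support else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 predicted entities, 6 of them matched, 10 in the ground truth.
precision, recall, f1 = nerval_scores(predicted=8, matched=6, support=10)
print(precision, recall, round(f1, 3))  # 0.75 0.6 0.667
```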