
Update eval data for tests

Manon Blanco requested to merge update-eval-data-test into main

Refs #231 (closed)

If we follow the current documentation, the teklia-dan evaluate --config configs/eval.json command gives the following results:

| Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) |
|:-----:|:-------------:|:---------:|:-------------:|:---------:|:------------------:|
| train |     130.23    |   130.23  |     100.0     |   100.0   |       100.0        |
|  val  |     126.83    |   126.83  |     100.0     |   100.0   |       100.0        |
|  test |     112.24    |   112.24  |     100.0     |   100.0   |       100.0        |
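For context, a CER above 100 together with a WER of exactly 100 is what an edit-distance metric produces when the prediction shares nothing with the reference and is longer than it. The sketch below illustrates this with a plain Levenshtein distance. It is a minimal illustration, not teklia-dan's own metric code, and the function names and example strings are hypothetical rather than taken from the test data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance as a percentage of the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def cer(ref, hyp):
    """Character error rate, in percent."""
    return error_rate(list(ref), list(hyp))


def wer(ref, hyp):
    """Word error rate, in percent, splitting on whitespace."""
    return error_rate(ref.split(), hyp.split())


# Hypothetical strings: the prediction is longer than the reference
# and has no character in common with it.
reference = "ab cd"
prediction = "xxxxxxx"

print(cer(reference, prediction))  # 7 edits over 5 reference characters
print(wer(reference, prediction))  # 2 edits over 2 reference words
```

Because the denominator is the reference length, insertions can push the rate past 100, which is why such values usually point to a mismatch between the model and the evaluation data rather than to an ordinarily poor model.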

So I updated the test data to get consistent results:

  • I used a tokens.yml file,
  • I used a batch size of 1,
  • I used the prediction model/data:
    • I replaced the previous tests/data/evaluate/checkpoints/best_0.pt file with the tests/data/prediction/model.pt file,
    • I wrote a labels.json file.
Edited by Manon Blanco
