# Tensorboard

DAN relies on Tensorboard to log metrics and predictions. This allows you to monitor the progress of your training.

## Access

To access Tensorboard, run `tensorboard --logdir={output_folder}/results/` in your local terminal.
Then, go to http://localhost:6006 to visualize your training.

## Metrics

Two metrics are commonly used to evaluate Automatic Text Recognition models.

- the Character Error Rate (CER) is the percentage of characters that have been transcribed incorrectly by the model.
- the Word Error Rate (WER) is the percentage of words that have been transcribed incorrectly by the model.
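
As an illustration, both metrics can be sketched with an edit distance (Levenshtein distance) between the reference and the predicted transcription, divided by the reference length. This is a minimal sketch under that common formulation, not necessarily how DAN computes them; the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_item != hyp_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character Error Rate in percent (the reference must not be empty)."""
    return 100 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate in percent, using whitespace-separated words."""
    ref_words = reference.split()
    return 100 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, with the reference `hello world` and the prediction `hallo world`, one character out of eleven is wrong (CER of about 9.1%) and one word out of two is wrong (WER of 50%). Because insertions count as errors, both rates can exceed 100% under this formulation.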

## Usage

Seven metrics are computed on the training set and six on the validation set, and all of them are logged in Tensorboard. In addition, five predictions are logged.

### Training metrics

Several metrics are computed on the training set:

- `train/{dataset}-train_loss_ce`: the cross entropy loss function.
- `train/{dataset}-train_cer`: the CER.
- `train/{dataset}-train_cer_no_token`: the CER ignoring Named Entity (NE) tokens (only characters are considered).
- `train/{dataset}-train_ner`: the CER ignoring characters (only NE tokens are considered).
- `train/{dataset}-train_wer`: the WER.
- `train/{dataset}-train_wer_no_punct`: the WER ignoring punctuation marks.
- `train/{dataset}-train_wer_no_token`: the WER ignoring Named Entity (NE) tokens (only characters are considered).
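
The `no_token`, `no_punct` and `ner` variants differ only in which symbols are kept before scoring. The sketch below shows one way such filtering could be done. The NE tokens `Ⓟ` and `Ⓛ` are hypothetical placeholders, and the helper names are not part of DAN.

```python
import string

# Hypothetical NE tokens, chosen for illustration only.
# The real set depends on your dataset.
NE_TOKENS = {"Ⓟ", "Ⓛ"}


def remove_tokens(text, tokens=NE_TOKENS):
    """Drop NE tokens and keep regular characters (`no_token` variants)."""
    kept = "".join(char for char in text if char not in tokens)
    return " ".join(kept.split())


def keep_tokens(text, tokens=NE_TOKENS):
    """Keep only NE tokens and drop regular characters (`ner` variant)."""
    return "".join(char for char in text if char in tokens)


def remove_punctuation(text):
    """Drop ASCII punctuation marks (`no_punct` variant)."""
    kept = "".join(char for char in text if char not in string.punctuation)
    return " ".join(kept.split())
```

Applied to `ⓅJohn ⓁParis, 1900.`, these helpers return `John Paris, 1900.`, `ⓅⓁ` and `ⓅJohn ⓁParis 1900` respectively. The filtered reference and prediction are then compared as usual.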

These metrics can be visualized in the `Scalars` tab in Tensorboard, under the `train` section.
<img src="../../../assets/tensorboard/example_scalars_train.png" />

Alternatively, you can find them in the `Time series` tab.

### Validation metrics

The same metrics are computed on the validation set, except for the loss function:

- `dev/{dataset}-dev_cer`: the CER.
- `dev/{dataset}-dev_cer_no_token`: the CER ignoring Named Entity (NE) tokens (only characters are considered).
- `dev/{dataset}-dev_ner`: the CER ignoring characters (only NE tokens are considered).
- `dev/{dataset}-dev_wer`: the WER.
- `dev/{dataset}-dev_wer_no_punct`: the WER ignoring punctuation marks.
- `dev/{dataset}-dev_wer_no_token`: the WER ignoring Named Entity (NE) tokens (only characters are considered).

These metrics can be visualized in the `Scalars` tab in Tensorboard, under the `dev` section.
<img src="../../../assets/tensorboard/example_scalars_val.png" />

Alternatively, you can find them in the `Time series` tab.
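
The logged values can also be read outside the web interface, for instance to compare runs in a script. The sketch below relies on the `EventAccumulator` class shipped with the `tensorboard` package and assumes the metrics were written as plain scalars. `parse_tag` and `load_scalars` are hypothetical helper names.

```python
def parse_tag(tag):
    """Split a tag such as `dev/{dataset}-dev_cer` into (split, dataset, metric)."""
    split, _, name = tag.partition("/")
    # The dataset name may itself contain hyphens, so split on the last
    # occurrence of `-{split}_`.
    dataset, _, metric = name.rpartition(f"-{split}_")
    return split, dataset, metric


def load_scalars(logdir):
    """Return {tag: [(step, value), ...]} for every scalar found in `logdir`."""
    # Imported lazily so that `parse_tag` works without Tensorboard installed.
    from tensorboard.backend.event_processing.event_accumulator import (
        EventAccumulator,
    )

    accumulator = EventAccumulator(logdir)
    accumulator.Reload()  # Read the event files from disk.
    return {
        tag: [(event.step, event.value) for event in accumulator.Scalars(tag)]
        for tag in accumulator.Tags()["scalars"]
    }
```

Calling `load_scalars` on the same folder passed to `--logdir` returns one series per tag, and `parse_tag` can then be used to group the series by split, dataset or metric.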

### Predictions on the validation set

Five validation images are also displayed at each epoch, along with their predicted transcription, CER and WER.
To log more or fewer images, update the `training.validation.nb_logged_images` parameter in the [configuration file](config.md). The font and its size can also be changed.
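
As a rough sketch, the parameter could be updated programmatically as follows. This assumes the configuration is loaded as nested dictionaries whose keys mirror the dotted parameter name; `set_parameter` is a hypothetical helper, not part of DAN.

```python
import json


def set_parameter(config, dotted_key, value):
    """Set a nested value such as `training.validation.nb_logged_images`."""
    *parents, last = dotted_key.split(".")
    node = config
    for key in parents:
        # Create intermediate sections if they are missing.
        node = node.setdefault(key, {})
    node[last] = value
    return config


# Assumed layout, reduced to the one parameter discussed here.
config = {"training": {"validation": {"nb_logged_images": 5}}}
set_parameter(config, "training.validation.nb_logged_images", 10)
print(json.dumps(config, indent=2))
```

Refer to the configuration documentation for the actual file layout and for the font parameters.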

To visualize them, go to the `Images` tab in Tensorboard.

<img src="../../../assets/tensorboard/example_val.png" />

Select an image to increase its size:

<img src="../../../assets/tensorboard/example_val_step_190.png" />

By default, the slider is set to the last validation step. You can move the cursor to view previous transcriptions on the same image:

<img src="../../../assets/tensorboard/example_val_step_30.png" />