From f4650117913580d55419b61ca9ab62fa4f5c751c Mon Sep 17 00:00:00 2001
From: manonBlanco <blanco@teklia.com>
Date: Wed, 17 Jan 2024 16:00:01 +0100
Subject: [PATCH] Update documentation

---
 docs/usage/predict/index.md | 11 ++++-------
 docs/usage/train/config.md  | 28 +++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/docs/usage/predict/index.md b/docs/usage/predict/index.md
index 8654a88d..656011b1 100644
--- a/docs/usage/predict/index.md
+++ b/docs/usage/predict/index.md
@@ -166,14 +166,13 @@ It will create the following JSON file named after the image and a GIF showing a
 
 This example assumes that you have already [trained a language model](../train/language_model.md).
 
-Note that:
-
-- the `weight` parameter defines how much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions.
-- linebreaks are treated as spaces by language models, as a result predictions will not include linebreaks.
+!!! note
+    - the `weight` parameter defines how much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions.
+    - linebreaks are treated as spaces by language models; as a result, predictions will not include linebreaks.
 
 #### Language model at character level
 
-First, update the `parameters.yml` file obtained during DAN training.
+Update the `parameters.yml` file obtained during DAN training.
 
 ```yaml
 parameters:
@@ -185,8 +184,6 @@ parameters:
     weight: 0.5
 ```
 
-Note that the `weight` parameter defines how much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions.
-
 Then, run this command:
 
 ```shell
diff --git a/docs/usage/train/config.md b/docs/usage/train/config.md
index ac7e1d60..8384b637 100644
--- a/docs/usage/train/config.md
+++ b/docs/usage/train/config.md
@@ -35,6 +35,32 @@ To determine the value to use for `dataset.max_char_prediction`, you can use the
 | `model.decoder.dec_dim_feedforward` | Number of dimensions for feedforward layer in transformer decoder layers. | `int` | `256` |
 | `model.decoder.attention_win` | Length of attention window. | `int` | `100` |
 
+### Language model
+
+This assumes that you have already [trained a language model](../train/language_model.md).
+
+| Name              | Description                                                                                                                                                | Type    | Default |
+| ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------| ------- | ------- |
+| `model.lm.path`   | Path to the language model.                                                                                                                                | `str`   |         |
+| `model.lm.weight` | How much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions. | `float` |         |
+
+!!! note
+    - linebreaks are treated as spaces by language models; as a result, predictions will not include linebreaks.
+
+The `model.lm.path` argument expects a path to the language model, but the parent folder should also contain:
+
+- a `lexicon.txt` file,
+- a `tokens.txt` file.
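+
+For example, assuming the training configuration is written in YAML (adapt the nesting to your own configuration file), the `model.lm.*` parameters could be set as in the sketch below; the path and weight values are illustrative only:
+
+```yaml
+model:
+  lm:
+    # Illustrative path: replace with the path to your trained language model
+    path: my_dataset/language_model/model.arpa
+    # Illustrative weight, usually between 0.5 and 2.0
+    weight: 0.5
+```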
+
+You should get the following tree structure:
+
+```
+folder/
+├── `model.lm.path` # Path to the language model
+├── lexicon.txt
+└── tokens.txt
+```
+
 ## Training parameters
 
 | Name | Description | Type | Default |
@@ -64,7 +90,7 @@ To determine the value to use for `dataset.max_char_prediction`, you can use the
 | `training.label_noise_scheduler.max_error_rate` | Maximum ratio of teacher forcing. | `float` | `0.2` |
 | `training.label_noise_scheduler.total_num_steps` | Number of steps before stopping teacher forcing. | `float` | `5e4` |
 | `training.transfer_learning.encoder` | Model to load for the encoder \[state_dict_name, checkpoint_path, learnable, strict\]. | `list` | `["encoder", "pretrained_models/dan_rimes_page.pt", True, True]` |
-| `training.transfer_learning.decoder` | Model to load for the decoder \[state_dict_name, checkpoint_path, learnable, strict\]. | `list` | `["encoder", "pretrained_models/dan_rimes_page.pt", True, False]` |
+| `training.transfer_learning.decoder` | Model to load for the decoder \[state_dict_name, checkpoint_path, learnable, strict\]. | `list` | `["decoder", "pretrained_models/dan_rimes_page.pt", True, False]` |
 
 - To train on several GPUs, simply set the `training.use_ddp` parameter to `True`. By default, the model will use all available GPUs. To restrict access to fewer GPUs, one can modify the `training.nb_gpu` parameter.
 - During the validation stage, the batch size is set to 1. This avoids problems associated with image sizes that can be very different inside batches and lead to significant padding, resulting in performance degradations.
--
GitLab