Commit f4650117 authored by Manon Blanco

Update documentation

parent b7e519cb
This commit is part of merge request !347.
@@ -166,14 +166,13 @@ It will create the following JSON file named after the image and a GIF showing a
This example assumes that you have already [trained a language model](../train/language_model.md).
!!! note
    - the `weight` parameter defines how much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions.
    - linebreaks are treated as spaces by language models; as a result, predictions will not include linebreaks.
#### Language model at character level
Update the `parameters.yml` file obtained during DAN training.
```yaml
parameters:
@@ -185,8 +184,6 @@ parameters:
weight: 0.5
```
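For orientation, here is a minimal sketch of what the full language-model block in `parameters.yml` could look like. The key names other than `weight`, and all paths, are assumptions for illustration; they are not taken from this diff.

```yaml
parameters:
  # ... other prediction parameters (collapsed in the diff above) ...
  language_model:                                              # hypothetical key name
    model: my_dataset/language_model/model_characters.arpa     # hypothetical path to the character-level model
    lexicon: my_dataset/language_model/lexicon_characters.txt  # hypothetical path
    tokens: my_dataset/language_model/tokens.txt               # hypothetical path
    weight: 0.5  # documented above: how much weight to give to the language model
```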
Then, run this command:
```shell
...
```
@@ -35,6 +35,32 @@ To determine the value to use for `dataset.max_char_prediction`, you can use the
| `model.decoder.dec_dim_feedforward` | Number of dimensions for feedforward layer in transformer decoder layers. | `int` | `256` |
| `model.decoder.attention_win` | Length of attention window. | `int` | `100` |
### Language model
This assumes that you have already [trained a language model](../train/language_model.md).
| Name | Description | Type | Default |
| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | ------- |
| `model.lm.path` | Path to the language model. | `str` | |
| `model.lm.weight` | How much weight to give to the language model. It should be set carefully (usually between 0.5 and 2.0) as it will affect the quality of the predictions. | `float` | |
!!! note
    - linebreaks are treated as spaces by language models; as a result, predictions will not include linebreaks.
The `model.lm.path` argument expects a path to the language model, but the parent folder should also contain:
- a `lexicon.txt` file,
- a `tokens.txt` file.
You should get the following tree structure:
```
folder/
├── `model.lm.path` # Path to the language model
├── lexicon.txt
└── tokens.txt
```
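To illustrate how these options fit together, here is a minimal sketch using the dotted parameter names from the table above as a nesting guide. The YAML layout and the model filename are assumptions, not taken from this diff.

```yaml
model:
  lm:
    path: folder/model.arpa  # hypothetical language-model file; lexicon.txt and tokens.txt sit next to it
    weight: 1.0              # set carefully, usually between 0.5 and 2.0
```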
## Training parameters
| Name | Description | Type | Default |
@@ -64,7 +90,7 @@ To determine the value to use for `dataset.max_char_prediction`, you can use the
| `training.label_noise_scheduler.max_error_rate` | Maximum ratio of teacher forcing. | `float` | `0.2` |
| `training.label_noise_scheduler.total_num_steps` | Number of steps before stopping teacher forcing. | `float` | `5e4` |
| `training.transfer_learning.encoder` | Model to load for the encoder \[state_dict_name, checkpoint_path, learnable, strict\]. | `list` | `["encoder", "pretrained_models/dan_rimes_page.pt", True, True]` |
| `training.transfer_learning.decoder` | Model to load for the decoder \[state_dict_name, checkpoint_path, learnable, strict\]. | `list` | `["decoder", "pretrained_models/dan_rimes_page.pt", True, False]` |
- To train on several GPUs, simply set the `training.use_ddp` parameter to `True`. By default, the model will use all available GPUs. To restrict training to fewer GPUs, set the `training.nb_gpu` parameter (see the sketch after this list).
- During the validation stage, the batch size is set to 1. This avoids problems with images of very different sizes within a batch, which would require significant padding and degrade performance.
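Here is a sketch combining the multi-GPU and transfer-learning options above, again using the dotted parameter names as a nesting guide. The YAML layout and the GPU count are illustrative assumptions; the transfer-learning values are the defaults from the table.

```yaml
training:
  use_ddp: True  # train on several GPUs with distributed data parallelism
  nb_gpu: 2      # illustrative: restrict training to 2 GPUs instead of all available ones
  transfer_learning:
    # [state_dict_name, checkpoint_path, learnable, strict]
    encoder: ["encoder", "pretrained_models/dan_rimes_page.pt", True, True]
    decoder: ["decoder", "pretrained_models/dan_rimes_page.pt", True, False]
```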