Commit cfc54b7a authored by Solene Tarride, committed by Mélodie Boillet
Update doc with `--discount_fallback` option for LM

parent 82d7946e
1 merge request: !304 Update doc with `--discount_fallback` option for LM
@@ -20,9 +20,12 @@ At character-level, we recommend building a 6-gram model. Use the following comm
```sh
bin/lmplz --order 6 \
--text my_dataset/language_model/corpus_characters.txt \
- --arpa my_dataset/language_model/model_characters.arpa
+ --arpa my_dataset/language_model/model_characters.arpa \
+ --discount_fallback
```
+Note that the `--discount_fallback` option can be removed if your corpus is very large.
The following message should be displayed if the language model was built successfully:
```sh
@@ -62,16 +65,19 @@ Chain sizes: 1:1308 2:27744 3:159140 4:412536 5:717920 6:1028896
Name:lmplz VmPeak:12643224 kB VmRSS:6344 kB RSSMax:1969316 kB user:0.196445 sys:0.514686 CPU:0.711161 real:0.682693
```
-### Subord-level
+### Subword-level
At subword-level, we recommend building a 6-gram model. Use the following command:
```sh
bin/lmplz --order 6 \
--text my_dataset/language_model/corpus_subwords.txt \
- --arpa my_dataset/language_model/model_subwords.arpa
+ --arpa my_dataset/language_model/model_subwords.arpa \
+ --discount_fallback
```
+Note that the `--discount_fallback` option can be removed if your corpus is very large.
### Word-level
At word-level, we recommend building a 3-gram model. Use the following command:
@@ -79,9 +85,12 @@ At word-level, we recommend building a 3-gram model. Use the following command:
```sh
bin/lmplz --order 3 \
--text my_dataset/language_model/corpus_words.txt \
- --arpa my_dataset/language_model/model_words.arpa
+ --arpa my_dataset/language_model/model_words.arpa \
+ --discount_fallback
```
+Note that the `--discount_fallback` option can be removed if your corpus is very large.
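
Since the notes above do not say how large a corpus must be before `--discount_fallback` can be dropped, one possible approach is to try the build without the flag and add it only when that attempt fails. The sketch below is not part of this commit: `build_lm` and the `LMPLZ` variable are hypothetical names, and retrying on any non-zero exit status is an assumption (a failure unrelated to discount estimation would also trigger the retry).

```sh
# Hypothetical helper, not part of the documented workflow.
# Usage: build_lm ORDER CORPUS_TXT OUTPUT_ARPA
build_lm() {
    order="$1"
    text="$2"
    arpa="$3"
    # LMPLZ lets the caller point at another binary; defaults to bin/lmplz.
    lmplz="${LMPLZ:-bin/lmplz}"

    # First attempt: no fallback, as suggested for very large corpora.
    if "$lmplz" --order "$order" --text "$text" --arpa "$arpa"; then
        return 0
    fi

    # Assumption: the failure came from a corpus that is too small,
    # so retry once with the fallback enabled.
    echo "lmplz failed; retrying with --discount_fallback" >&2
    "$lmplz" --order "$order" --text "$text" --arpa "$arpa" --discount_fallback
}
```

For example, `build_lm 3 my_dataset/language_model/corpus_words.txt my_dataset/language_model/model_words.arpa` would correspond to the word-level command above.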
## Predict with a language model
See the [dedicated example](../predict/index.md#predict-with-an-external-n-gram-language-model).