Draft: Decode without <unk>
Some tokens appear rarely or never in the training set, so the model cannot learn to recognize them accurately. To handle this, we map them to the <unk>
token.
However, we need to prevent the network from predicting the <unk>
token at decoding time, as such a prediction is always incorrect.
Ex: https://demo.arkindex.org/element/bd15be49-8c40-47a9-a627-991ae3754209?highlight=ac2ab5b4-63ea-4916-8f65-db550a5fda58 => <unk> Mange ved jo digre si
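
One possible way to do this is to mask the <unk> score before picking the best token at each decoding step. The sketch below is only an illustration in plain Python, not the project's actual decoder: the names `mask_token` and `greedy_decode`, the toy `charset`, and the score values are all hypothetical, and it assumes the model outputs one score per token at each step.

```python
import math


def mask_token(scores, token_index):
    """Return a copy of the per-step scores with one token set to -inf."""
    masked = []
    for step in scores:
        step = list(step)  # copy so the caller's scores are left untouched
        step[token_index] = -math.inf
        masked.append(step)
    return masked


def greedy_decode(scores, charset, unk_token="<unk>"):
    """Pick the best-scoring token at each step, never selecting unk_token."""
    unk_index = charset.index(unk_token)
    masked = mask_token(scores, unk_index)
    return [
        charset[max(range(len(step)), key=step.__getitem__)]
        for step in masked
    ]


# Hypothetical toy vocabulary and scores (assumptions, not real model output).
charset = ["a", "b", "<unk>"]
scores = [
    [0.1, 0.2, 0.9],  # <unk> would win here without masking
    [0.7, 0.1, 0.2],
]
decoded = greedy_decode(scores, charset)
```

With the mask applied, the first step falls back to the next best token instead of <unk>. This only changes decoding: the model can still be trained with <unk> as a target for rare tokens.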