Do not decompose special tokens in `lexicon.txt`

When generating the lexicon file, the special tokens (`<ctc>`, `<unk>`, `<space>`) should not be decomposed into characters. The code comments already describe this behavior, but the code itself does not treat these tokens any differently from regular words.

We end up with a lexicon file that looks like:

<ctc> < c t c >
a a
b b
c c
<space> < s p a c e >

But we need:

for a character-based LM:

<ctc> <ctc>
a a
b b
c c
<space> <space>

for a word-based LM:

<ctc> <ctc>
vendredi v e n d r e d i
huit h u i t
septembre s e p t e m b r e
<space> <space>
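
The expected behavior can be sketched roughly as follows. This is only an illustration of the idea, not the actual code from this repository: the names `SPECIAL_TOKENS`, `lexicon_entry`, and `build_lexicon` are hypothetical, and it assumes each lexicon line is a word followed by its space-separated spelling.

```python
# Hypothetical sketch: names and structure are assumptions, not the repository's code.
SPECIAL_TOKENS = ("<ctc>", "<unk>", "<space>")


def lexicon_entry(word, special_tokens=SPECIAL_TOKENS):
    """Return one lexicon line: the word followed by its spelling."""
    if word in special_tokens:
        # Special tokens map to themselves as a single unit.
        spelling = [word]
    else:
        # Regular words are spelled out character by character.
        spelling = list(word)
    return word + " " + " ".join(spelling)


def build_lexicon(words, special_tokens=SPECIAL_TOKENS):
    """Join one lexicon line per word, each terminated by a newline."""
    return "".join(lexicon_entry(w, special_tokens) + "\n" for w in words)


if __name__ == "__main__":
    words = ["<ctc>", "vendredi", "huit", "septembre", "<space>"]
    print(build_lexicon(words), end="")
```

With a check like this in place, a single-character word such as `a` still produces `a a`, so the same function would cover both the character-based and the word-based case.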