Move the charset computation to `download`
Same spirit as #291 (closed).
The charset should be computed during the download step, where we actually know which text ends up in the training set (some images might be missing).
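
A minimal sketch of the idea, not the actual implementation: the charset is built only from transcriptions whose image is really available. The function name `collect_charset`, the `(image_path, text)` sample layout, and the `image_exists` hook are all hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

Sample = Tuple[str, str]  # (image_path, transcription): assumed layout


def collect_charset(
    samples: Iterable[Sample],
    image_exists: Callable[[str], bool] = lambda p: Path(p).is_file(),
) -> Tuple[List[Sample], List[str]]:
    """Keep samples whose image is present and build the charset from them.

    Hypothetical helper: it stands in for whatever the download step does
    once it knows which images were actually retrieved.
    """
    kept: List[Sample] = []
    charset = set()
    for image_path, text in samples:
        if not image_exists(image_path):
            # Missing image: its text must not contribute to the charset.
            continue
        kept.append((image_path, text))
        charset.update(text)
    # Sort for a deterministic charset order across runs.
    return kept, sorted(charset)


if __name__ == "__main__":
    demo = [("a.png", "ab"), ("missing.png", "xyz"), ("b.png", "bc")]
    kept, charset = collect_charset(demo, image_exists=lambda p: p != "missing.png")
    print(len(kept), charset)
```

Computing the charset at this point means characters that only appear in samples with a missing image never reach the training vocabulary.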