Move the unknown token replacement step to download
Closes #291
I was able to extract a real dataset using the same commands as in https://redmine.teklia.com/issues/7433#note-23:
teklia-dan dataset entities \
  /home/training_data/ATR_page/Heritus/heritus-20240619-073640.sqlite

teklia-dan dataset tokens \
  ./entities.yml

teklia-dan dataset extract \
  /home/training_data/ATR_page/Heritus/heritus-20240619-073640.sqlite \
  --dataset-id fc6d4c69-ca18-4f39-8479-b891cc93bd29 \
  --element-type double_page \
  --output . \
  --tokens ./tokens.yml \
  --transcription-worker-runs 3947a85e-0661-42bc-8d71-4852adb94375 54e9034a-0ce9-43ab-8f12-2885fcc83842 ad2e5af5-a61a-4ade-827a-0129b0cfa493 \
  --entity-worker-runs 3947a85e-0661-42bc-8d71-4852adb94375 54e9034a-0ce9-43ab-8f12-2885fcc83842 ad2e5af5-a61a-4ade-827a-0129b0cfa493

# Merge dataset (override with Callico transcriptions)
teklia-dan dataset extract \
  /home/training_data/ATR_page/Heritus/heritus-20240619-073640.sqlite \
  --dataset-id 10f85201-3a7c-4190-806a-ae5452503280 \
  --element-type double_page \
  --output . \
  --tokens ./tokens.yml \
  --transcription-worker-runs ef07caeb-191c-4609-aee7-72308a7201ab \
  --entity-worker-runs ef07caeb-191c-4609-aee7-72308a7201ab

teklia-dan dataset download \
  --output . \
  --tokens ./tokens.yml \
  --max-width 1800

teklia-dan dataset analyze \
  --labels ./labels.json \
  --tokens ./tokens.yml \
  --output-file ./analyze.md
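
For readers unfamiliar with the step named in the title, here is a minimal sketch of what replacing unknown characters can look like in general. It is not taken from the DAN code base: the function name `replace_unknown_chars`, the `⁇` placeholder, and building the charset from training texts are all illustrative assumptions.

```python
def replace_unknown_chars(text: str, charset: set[str], unknown_token: str = "⁇") -> str:
    """Replace every character of `text` that is not in `charset`.

    Hypothetical helper for illustration only; the real implementation
    in teklia-dan may differ.
    """
    return "".join(ch if ch in charset else unknown_token for ch in text)


# Assumption: the charset is the set of characters seen in training texts.
train_texts = ["abc", "cab"]
charset = set("".join(train_texts))

# "d" was never seen in training, so it is replaced by the placeholder.
cleaned = replace_unknown_chars("abd", charset)
print(cleaned)
```

Doing this replacement late in the pipeline (here, at download time) means the charset can be computed once over all extracted and merged data, which matters when several `extract` runs write into the same output directory as above.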
Edited by Manon Blanco