Dataset analysis

Description

Use the teklia-dan dataset analyze command to analyze a dataset. The resulting statistics are formatted as Markdown and saved to the path given by --output-file.

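| Parameter     | Description                      | Type         | Default |
| ------------- | -------------------------------- | ------------ | ------- |
| --labels      | Path to the labels.json file.    | pathlib.Path |         |
| --tokens      | Path to the tokens.yml file.     | pathlib.Path |         |
| --output-file | Where the summary will be saved. | pathlib.Path |         |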

Examples

Display statistics for an HTR dataset

teklia-dan dataset analyze \
    --labels path/to/dataset/labels.json \
    --output-file statistics.md

Display statistics for an HTR-NER dataset

teklia-dan dataset analyze \
    --labels path/to/dataset/labels.json \
    --tokens path/to/tokens.yml \
    --output-file statistics.md
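
Both examples assume that the input files exist at the given paths. As a minimal sketch, the check_inputs helper below verifies that each input file is readable and non-empty before the command is run. This helper is hypothetical and not part of teklia-dan, and the paths in the usage comment are the same placeholders as in the examples above.

```shell
#!/usr/bin/env bash

# Hypothetical helper, NOT provided by teklia-dan.
# Returns 0 when every path passed as an argument is a readable,
# non-empty file; otherwise prints the offending path and returns 1.
check_inputs() {
    local path
    for path in "$@"; do
        if [ ! -r "$path" ] || [ ! -s "$path" ]; then
            echo "Missing, unreadable or empty input: $path" >&2
            return 1
        fi
    done
    return 0
}

# Usage sketch (placeholder paths):
# check_inputs path/to/dataset/labels.json path/to/tokens.yml && \
#     teklia-dan dataset analyze \
#         --labels path/to/dataset/labels.json \
#         --tokens path/to/tokens.yml \
#         --output-file statistics.md
```

Chaining with && means the analysis only starts when the check succeeds, so a mistyped path is reported before any processing begins.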