Release notes

  • 5386b77f Remove typesystem dep
  • a903b466 Bump Python requirement arkindex-client to 1.0.11
  • 27b26254 Bump Python requirement teklia-line-image-extractor to 0.2.7
  • b6481873 Bump Python requirement tqdm to 4.64.1
  • 406d9dc9 Remove apistar dep
  • cdd1a6a6 Bump precommit hooks
  • 54312b78 Allow filtering by direct parent's metadata
  • 8a241052 Only look for classes when we load them
  • 33f6dc16 Use new official repository for flake8 on CI
  • 7be2623c Implement metadata filtering
  • 9be8a911 Bump Python requirement jsonargparse to 4.13.2
  • 94056a19 Bump Python requirement jsonargparse to 4.13.1
  • 48d67da0 Bump Python requirement jsonargparse to 4.11.0
  • cca26d5e Export parameters json
  • 32371eab Split by page
  • 95c80791 Add basic tests for downloading and splitting
  • 1b72fc39 Bump Python requirement tqdm to 4.64.0
  • d0fd44fa Rename images with line id
  • fd06f077 Bump Python requirement teklia-line-image-extractor to 0.2.4
  • 5eb33cdf Bump Python requirement arkindex-client to 1.0.9
  • dcbea608 Update CI
  • d62f5241 Use line image extractor
  • 2f72d0a5 Revert "use cached paginate in selection"
  • 2dd8a442 use cached paginate in selection
  • 10377420 fix linting
  • aa982900 support choosing elements by arkindex selection
  • f2d4d49d fix linting
  • e86b736b fail when transcriptions contain a newline
  • 9b537f02 Add style filter (handwritten, typewritten); support ignored_classes
  • 3fbdf930 no more best_classes
  • a5a40182 choose only one transcription based on the accepted worker version ids (order is important)
  • 5d328548 add username to default cache path to avoid conflicts with other user caches
  • 32de35af fix linting
  • 7c56a81c add cached api client
  • d6cf517d Raise an exception if multiple transcriptions from the same text_line
  • 9ab6085f add skew extraction modes
  • a297c651 cache full size images to make using several extractions of the same images faster
  • 641d6c3f remove deprecated filtering by source.slug
  • 681f7ff5 Don't filter vertical lines with rotation class
  • 9f61d8c8 Polygon resize
  • f096d85d fix color arg (before always grayscale)
  • 47be8292 Use rotation class
  • 3353a171 forgot to commit main
  • a53f8f96 rename script to avoid clash with package name
  • 540884e9 Add deskew extraction
  • fcfdfcbe add sorts to make the splits reproducible
  • ada7640d use text_line by default
  • 389d8bb5 update gitignore
  • 3e6e387a add codespell ignore lines file
  • dceab604 add gitlab ci, first test
  • d1c3fc0b move kaldi_data_generator to dir
  • 447a1551 remove duplicate f.close()
  • 63e24774 fix kraken format with polygon
  • 493b10c8 use manual instead of None for manual transcriptions
  • a0c0910c applied filter by worker version commit
  • 45c9e42e Add option to skip vertical lines
  • b263ee7c integrating kraken
  • c6363c89 --dataset_name is not required for --split_only
  • 5a280bd7 use latest version of arkindex-client
  • a24a0f08 fix formatting
  • d8a643d0 add option to use existing split; add option to create a split from already downloaded lines
  • 338fda47 support new transcriptions
  • 96c2464a Filter elements according to their classes
  • 9556bdbc fix formatting
  • a21bc941 select folders by ids
  • 2b036eb1 add volume_type option
  • 48746884 set random seed
  • 2f412361 sort lines before extracting
  • aac3d0f7 add tqdm to requirements
  • c77030ab add argument to select slugs
  • f5adb3b4 add tqdm for duration estimation
  • 050824d0 fix bug if no s3_url
  • 94bd75e2 fix formatting
  • 0989a2df use logger
  • c77cdf57 use corpus id
  • 1d3e5576 Update README.md
  • f318c3d9 add types
  • a55ed1c7 clean up
  • 3e331778 update readme
  • c98795ce add option to extract polygon images
  • e050fe21 deal with negative polygons
  • e0214ac2 update readme
  • 918c600b add requirements
  • d202ed1b fix line_id bug
  • 6aee925c remove example fn
  • 9d08e020 add readme
  • 788aa635 add argparse
  • 5b8081dd add split enum
  • 967fb987 extract kaldi partition splitter
  • 4a76d32d refactor, use class
  • 830ddf72 initial commit