Remove the build_entities task

The arkindex_tasks.build_entities task was used in the Transkribus import to build entities, entity roles and entity links specifically for Balsac. Nowadays this would be done in a separate Balsac-specific worker. Most of its complex logic is specific to the data from the Balsac project, and entities from other projects might use different annotation methods, so the task is not reusable.

Please remove:

  • The task
  • The tests
  • The fixtures directory
  • The fixtures directory from MANIFEST.in
  • The dependency on Levenshtein
  • The installation of Levenshtein in .gitlab-ci.yml
  • The creation of the transcriptions.json file from the TranskribusImporter
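
Once the items above are removed, a quick scan for leftover references can help confirm that nothing was missed. The following is only a minimal sketch: the function name `find_leftovers` and the list of search strings are assumptions made for illustration, not part of arkindex_tasks, and a match is a hint to review rather than proof of a problem.

```python
from pathlib import Path

# Strings whose presence suggests the removal is incomplete.
# This list is an assumption based on the checklist above; adjust as needed.
LEFTOVER_PATTERNS = ("build_entities", "Levenshtein", "transcriptions.json")


def find_leftovers(root, patterns=LEFTOVER_PATTERNS):
    """Return (relative path, line number, pattern) for each remaining reference."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Ignore directories and anything inside the .git directory.
        if not path.is_file() or ".git" in relative.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            for pattern in patterns:
                if pattern in line:
                    hits.append((relative.as_posix(), number, pattern))
    return hits


if __name__ == "__main__":
    # Scan the current directory, assumed to be the repository root.
    for path, number, pattern in find_leftovers("."):
        print(f"{path}:{number}: {pattern}")
```

Run from the repository root, this prints one line per remaining mention, covering files such as MANIFEST.in, .gitlab-ci.yml and the Dockerfile alongside the Python sources. If the script itself is saved inside the repository, it will report its own pattern list.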

Removing the dependency on Levenshtein will require rebuilding the base image: create the git tag base-0.3.20 (or whichever version of tasks comes next), push to trigger the CI rebuild of the base image, then bump the base image version in .gitlab-ci.yml and the Dockerfile.

Edited by Erwan Rouchet