Use a worker for S3 imports
https://redmine.teklia.com/issues/6067
Just like with the init_elements task (#1717), the arkindex_tasks.import_s3 task is being moved to a worker.
A new INGEST_DOCKER_IMAGE Django setting should be introduced, from docker.ingest_image in the YAML configuration. It defaults to registry.gitlab.teklia.com/arkindex/workers/import:latest.
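As a rough illustration of the defaulting behaviour, here is a minimal sketch that assumes the YAML configuration has already been parsed into a plain dict. The helper name `get_ingest_docker_image` is hypothetical and is not taken from the Arkindex code base; only the setting name, the `docker.ingest_image` key and the default image come from this issue.

```python
# Default image, as stated in this issue.
DEFAULT_INGEST_IMAGE = "registry.gitlab.teklia.com/arkindex/workers/import:latest"


def get_ingest_docker_image(config: dict) -> str:
    """Return docker.ingest_image from a parsed YAML config, or the default.

    Hypothetical helper: the real configuration parser may work differently.
    """
    docker = config.get("docker") or {}
    return docker.get("ingest_image") or DEFAULT_INGEST_IMAGE


# In a Django settings module, this value would be assigned to the new setting.
INGEST_DOCKER_IMAGE = get_ingest_docker_image({"docker": {}})
```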
A new WorkerVersion.objects.ingest_version cached property should be introduced, which works like init_elements_version, but with that new setting.
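The caching behaviour could look roughly like the sketch below. It is framework-free on purpose: the `WorkerVersionManager` class here is a stand-in that looks versions up in a dict instead of the database, and the `lookups` counter exists only to make the caching visible. How init_elements_version actually resolves its version is not described in this issue, so the lookup logic is an assumption.

```python
from functools import cached_property


class WorkerVersionManager:
    """Stand-in for the real manager; versions are looked up in a dict."""

    def __init__(self, versions_by_image: dict, ingest_image: str):
        self._versions_by_image = versions_by_image
        self._ingest_image = ingest_image
        self.lookups = 0  # counts how often the lookup really runs

    @cached_property
    def ingest_version(self):
        # Runs once per manager instance when it succeeds; the result is
        # then stored on the instance and returned directly afterwards.
        self.lookups += 1
        try:
            return self._versions_by_image[self._ingest_image]
        except KeyError:
            raise LookupError(
                f"No worker version found for image {self._ingest_image!r}"
            ) from None
```

Note that `functools.cached_property` does not cache exceptions: a failed lookup is retried on the next access, which matches the expectation that requests keep failing until the version exists.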
A new system check should call this cached property, so that it is actually cached, and should catch its errors to cause a new warning. Please update the system checks wiki page to document this new warning. If this warning appears, admins ought to expect HTTP 500 errors when trying to start an S3 import.
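A possible shape for that check, sketched without Django so that it runs on its own: in the real project the function would presumably be registered with `django.core.checks.register` and return `django.core.checks.Warning` instances with a proper check ID. The `CheckWarning` class, the function signature and the message wording below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class CheckWarning:
    """Stand-in for django.core.checks.Warning (hypothetical)."""

    msg: str
    hint: str


def ingest_version_check(get_ingest_version) -> list:
    """Access the cached property once and turn any error into a warning.

    `get_ingest_version` is a callable wrapping the property access, e.g.
    `lambda: WorkerVersion.objects.ingest_version` in the real code.
    """
    try:
        get_ingest_version()  # side effect: populates the cache on success
    except Exception as e:
        return [
            CheckWarning(
                msg=f"The ingest worker version could not be found: {e}",
                hint="Starting or retrying an S3 import will cause HTTP 500 errors.",
            )
        ]
    return []
```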
When starting or retrying an S3 import, the ProcessBuilder should now do the following:
- If a WorkerRun for the ingest worker version does not exist:
  - Create a WorkerConfiguration on the worker of the ingest worker version, or use an existing one, which contains the following fields:
    - bucket: the name of the bucket as specified in the process;
    - bucket_prefix: the value of the settings.INGEST_PREFIX_BY_BUCKET_NAME setting, a boolean;
    - iiif_base_url: the URL of the ImageServer used for S3 ingest.
  - Create a WorkerRun on the process, with the import worker version and that configuration.
- Start a new task from that WorkerRun, with the ARKINDEX_WORKER_RUN_ID and INGEST_S3_* environment variables to provide authentication credentials.
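
The steps above can be sketched as follows. This is an in-memory illustration only, not the ProcessBuilder implementation: the `Process` and `WorkerRun` dataclasses, both function names and their parameters are hypothetical, and configurations are plain dicts compared by content. The exact names of the INGEST_S3_* variables are not listed in this issue, so the sketch simply forwards whatever variables carry that prefix.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class WorkerRun:
    version_id: str
    configuration: dict
    id: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class Process:
    bucket_name: str
    worker_runs: list = field(default_factory=list)


def get_or_create_ingest_run(process, ingest_version_id, configurations,
                             prefix_by_bucket_name, iiif_base_url):
    """Return the process's ingest WorkerRun, creating it if needed."""
    for run in process.worker_runs:
        if run.version_id == ingest_version_id:
            return run  # already exists, e.g. when retrying

    wanted = {
        "bucket": process.bucket_name,
        "bucket_prefix": prefix_by_bucket_name,  # boolean setting
        "iiif_base_url": iiif_base_url,
    }
    # Reuse a configuration with identical content, or create a new one.
    config = next((c for c in configurations if c == wanted), None)
    if config is None:
        config = wanted
        configurations.append(config)

    run = WorkerRun(version_id=ingest_version_id, configuration=config)
    process.worker_runs.append(run)
    return run


def build_task_env(run, s3_env):
    """Build the task environment from the WorkerRun and S3 credentials."""
    env = {"ARKINDEX_WORKER_RUN_ID": run.id}
    # Only forward variables that carry the INGEST_S3_ prefix.
    env.update({k: v for k, v in s3_env.items() if k.startswith("INGEST_S3_")})
    return env
```

Calling `get_or_create_ingest_run` twice on the same process returns the same WorkerRun, which is the behaviour a retry relies on.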