Use a worker for S3 imports
https://redmine.teklia.com/issues/6067
Just like with the init_elements task (#1717 (closed)), the arkindex_tasks.import_s3 task is being moved to a worker.
A new INGEST_DOCKER_IMAGE Django setting should be introduced, from docker.ingest_image in the YAML configuration. It defaults to registry.gitlab.teklia.com/arkindex/workers/import:latest.
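As a rough illustration of the setting lookup described above, here is a minimal sketch. It assumes the YAML configuration has already been parsed into a plain dict; the helper name get_ingest_docker_image is hypothetical and not part of the existing code base.

```python
# Default taken from the issue description.
DEFAULT_INGEST_IMAGE = "registry.gitlab.teklia.com/arkindex/workers/import:latest"


def get_ingest_docker_image(conf: dict) -> str:
    """Return docker.ingest_image from the parsed configuration, or the default.

    `conf` is assumed to be the already-parsed YAML configuration;
    this helper is hypothetical and only illustrates the fallback.
    """
    # The `docker` section may be missing or empty in the YAML file.
    docker_conf = conf.get("docker") or {}
    return docker_conf.get("ingest_image") or DEFAULT_INGEST_IMAGE


# Hypothetical usage in the Django settings module:
# INGEST_DOCKER_IMAGE = get_ingest_docker_image(conf)
```

The real configuration parser may already handle defaults on its own, in which case only the new key and its default need to be declared there.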
A new WorkerVersion.objects.ingest_version cached property should be introduced, which works like init_elements_version, but with that new setting.
A new system check should call this cached property, so that it is actually cached, and should catch its errors to cause a new warning. Please update the system checks wiki page to document this new warning. If this warning appears, admins ought to expect HTTP 500 errors when trying to start an S3 import.
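The interaction between the cached property and the system check could look roughly like the sketch below. It uses only the standard library so that it runs on its own: CheckWarning stands in for django.core.checks.Warning, WorkerVersionManager stands in for the real manager, and the lookup callable and the warning ID "example.W001" are hypothetical placeholders.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass
class CheckWarning:
    """Stand-in for django.core.checks.Warning, to keep the sketch standalone."""
    msg: str
    hint: str
    id: str


class WorkerVersionManager:
    """Stand-in for the real manager; `lookup` is a hypothetical callable
    that finds the worker version matching the ingest Docker image."""

    def __init__(self, lookup):
        self._lookup = lookup

    @cached_property
    def ingest_version(self):
        # Evaluated on first access; the result is then stored on the instance.
        # If the lookup raises, nothing is cached and the next access retries.
        return self._lookup()


def check_ingest_version(manager):
    """Access the cached property so it is populated, and turn errors into a warning."""
    try:
        manager.ingest_version
    except Exception as e:
        return [
            CheckWarning(
                msg=f"Could not find the ingest worker version: {e}",
                hint="Starting an S3 import is expected to fail with HTTP 500.",
                id="example.W001",  # hypothetical ID, not the real one
            )
        ]
    return []
```

Because the check touches the property at startup, later requests reuse the cached value instead of repeating the lookup. When the lookup fails, the check only reports a warning, so the error resurfaces when an S3 import is started.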
When starting or retrying an S3 import, the ProcessBuilder should now do the following:
- If a WorkerRun for the ingest worker version does not exist:
  - Create a WorkerConfiguration on the worker of the ingest worker version, or use an existing one, which contains the following fields:
    - bucket: the name of the bucket as specified in the process;
    - bucket_prefix: the value of the settings.INGEST_PREFIX_BY_BUCKET_NAME setting, a boolean;
    - iiif_base_url: the URL of the ImageServer used for S3 ingest.
  - Create a WorkerRun on the process, with the import worker version and that configuration.
- Start a new task from that WorkerRun, with the ARKINDEX_WORKER_RUN_ID and INGEST_S3_* environment variables to provide authentication credentials.
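The steps above can be sketched with in-memory stand-ins for the Django models. Everything here is an assumption made for illustration: the Process and WorkerRun dataclasses, the get_or_create_ingest_run and build_task_env helpers, and the way S3 credentials are passed in as a dict. The actual INGEST_S3_* variable names are not spelled out in this issue, so the sketch only filters on the prefix.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class WorkerRun:
    """Stand-in for the WorkerRun model."""
    version: str
    configuration: dict
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class Process:
    """Stand-in for an S3 import process."""
    bucket_name: str
    worker_runs: list = field(default_factory=list)


def get_or_create_ingest_run(process, ingest_version, configurations,
                             prefix_by_bucket_name, iiif_base_url):
    """Return the process's WorkerRun for the ingest version, creating it if needed.

    `configurations` is a plain list standing in for the WorkerConfiguration
    objects that exist on the ingest worker.
    """
    # Reuse an existing WorkerRun, e.g. when the import is retried.
    for run in process.worker_runs:
        if run.version == ingest_version:
            return run

    wanted = {
        "bucket": process.bucket_name,
        "bucket_prefix": prefix_by_bucket_name,
        "iiif_base_url": iiif_base_url,
    }
    # Use an existing configuration with the same content, or create one.
    config = next((c for c in configurations if c == wanted), None)
    if config is None:
        config = wanted
        configurations.append(config)

    run = WorkerRun(version=ingest_version, configuration=config)
    process.worker_runs.append(run)
    return run


def build_task_env(run, credentials):
    """Build the task environment: the WorkerRun ID plus INGEST_S3_* variables."""
    env = {"ARKINDEX_WORKER_RUN_ID": str(run.id)}
    env.update({k: v for k, v in credentials.items() if k.startswith("INGEST_S3_")})
    return env
```

Calling get_or_create_ingest_run twice on the same process returns the same WorkerRun, which is what makes retrying an import safe in this sketch.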