Use a worker for init_elements
https://redmine.teklia.com/issues/6067
Requires arkindex/workers/init-elements#2 (closed)
As the init_elements task for worker processes will be moved away from arkindex_tasks to a worker, we need a way for the backend to find the correct worker version and add it automatically when starting a process.
A new INIT_ELEMENTS_DOCKER_IMAGE setting should be introduced, holding a Docker image tag. In the YAML config, it should be docker.init_elements_image, to match the existing docker.tasks_image setting. It is optional, and defaults to registry.gitlab.teklia.com/arkindex/workers/init-elements:latest. Please update the backend configuration wiki page.
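As a rough illustration of the optional setting and its default, here is a minimal sketch that resolves the image tag from an already-parsed YAML config. The function name get_init_elements_image and the plain-dict config are assumptions made for this example; the backend's real configuration parser is not shown in this issue.

```python
# Default taken from this issue's description.
DEFAULT_INIT_ELEMENTS_IMAGE = (
    "registry.gitlab.teklia.com/arkindex/workers/init-elements:latest"
)


def get_init_elements_image(config: dict) -> str:
    """Return docker.init_elements_image, or the default when it is unset.

    `config` stands in for the parsed YAML configuration (an assumption).
    """
    docker = config.get("docker") or {}
    return docker.get("init_elements_image") or DEFAULT_INIT_ELEMENTS_IMAGE


# What INIT_ELEMENTS_DOCKER_IMAGE would hold for a config that only
# sets the existing docker.tasks_image key.
INIT_ELEMENTS_DOCKER_IMAGE = get_init_elements_image(
    {"docker": {"tasks_image": "registry.example.com/tasks:latest"}}
)
```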
A new WorkerVersion.objects.init_elements_version cached property should be introduced, which works like imports_version, but finds a version by the Docker image configured in the new setting. It should raise an error if no version is found, if multiple versions are found, or if the version is not available.
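The lookup rules above could be sketched as follows, outside of Django so that it runs on its own. VersionStub, VersionManagerStub and InitElementsVersionError are hypothetical stand-ins, and the in-memory list replaces the real queryset; only the three error cases and the caching behaviour come from this issue.

```python
from dataclasses import dataclass
from functools import cached_property


class InitElementsVersionError(Exception):
    """Hypothetical error type; the issue does not name one."""


@dataclass(frozen=True)
class VersionStub:
    # Stand-in for a worker version; field names are assumptions.
    id: str
    docker_image: str
    available: bool


class VersionManagerStub:
    def __init__(self, versions, image):
        self._versions = list(versions)
        self._image = image  # value of INIT_ELEMENTS_DOCKER_IMAGE

    @cached_property
    def init_elements_version(self):
        matches = [v for v in self._versions if v.docker_image == self._image]
        if not matches:
            raise InitElementsVersionError(f"No version uses {self._image}")
        if len(matches) > 1:
            raise InitElementsVersionError(f"Multiple versions use {self._image}")
        version = matches[0]
        if not version.available:
            raise InitElementsVersionError(f"Version {version.id} is not available")
        return version
```

Note that functools.cached_property only stores a value when the getter returns, so a failed lookup is retried, and raises again, on each access.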
A new system check should call this cached property, so that it is actually cached, and should catch its errors and report them as a new warning. Please update the system checks wiki page to document this new warning. If this warning appears, admins should expect HTTP 500 errors when trying to start or retry a process.
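A minimal sketch of the check logic, kept free of Django so it can run standalone. CheckWarning mimics the shape of a system check message, and the function name, the load_version callable and the example.W001 identifier are all placeholders; in the backend this would presumably be registered with Django's system check framework and use the project's own warning numbering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckWarning:
    # Simplified stand-in for a system check warning message.
    msg: str
    hint: str
    id: str


def check_init_elements_version(load_version):
    """Call the lookup once; turn any failure into a warning, not a crash.

    `load_version` is a hypothetical callable wrapping the access to
    WorkerVersion.objects.init_elements_version.
    """
    try:
        load_version()  # a successful call also warms the cache
    except Exception as e:
        return [
            CheckWarning(
                msg=f"Could not find the init_elements worker version: {e}",
                hint="Starting or retrying a process will fail until this is fixed.",
                id="example.W001",  # placeholder identifier
            )
        ]
    return []
```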
The ProcessBuilder needs to be updated to support this new worker. It no longer creates a standalone initialisation task without a worker run. This gets a bit more complex because we need to handle both starting and retrying, because users could add that worker to a process themselves, and because this worker does not run in chunks.
- Put all of the process' worker runs in a separate list.
- If there is one worker run in the process that uses the initialisation worker version and that has no parents, remove it from the list, and keep it separately. If there are multiple of them, just grab the first one.
- If there isn't one, create it. Take all of the worker runs that had no parents, and update them so that they depend on this new worker run.
  Note that since we are using a list of worker runs, you'll either need to update both with one SQL query and then edit all the attributes in Python, or update and then fetch all the worker runs again while avoiding a stale read.
- Create a task for that special worker run. If the process uses chunks, there still is only one task for it.
- Create tasks for all the other worker runs without any special handling.
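The steps above can be sketched in plain Python as follows. Run, split_init_run and plan_tasks are hypothetical names, tasks are reduced to (worker run ID, chunk) pairs, and database updates are replaced by in-memory edits; the sketch also assumes that the other worker runs get one task per chunk.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Run:
    # Stand-in for a worker run; field names are assumptions.
    version_id: str
    parents: list = field(default_factory=list)
    id: str = field(default_factory=lambda: str(uuid4()))


def split_init_run(worker_runs, init_version_id):
    """Return (init_run, other_runs), creating the init run if needed."""
    runs = list(worker_runs)
    # First parentless run using the initialisation worker version, if any.
    init_run = next(
        (r for r in runs if r.version_id == init_version_id and not r.parents),
        None,
    )
    if init_run is not None:
        runs = [r for r in runs if r is not init_run]
    else:
        init_run = Run(version_id=init_version_id)
        # Every run that had no parents now depends on the new init run.
        for run in runs:
            if not run.parents:
                run.parents.append(init_run.id)
    return init_run, runs


def plan_tasks(worker_runs, init_version_id, chunks=1):
    init_run, others = split_init_run(worker_runs, init_version_id)
    # The init run always gets exactly one task, even with chunks.
    tasks = [(init_run.id, None)]
    for run in others:
        for chunk in range(1, chunks + 1):
            tasks.append((run.id, chunk if chunks > 1 else None))
    return tasks
```

In the backend, the in-memory parents edit would have to be mirrored in the database, which is where the stale read concern from the note above comes in.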