Missing database check constraint and serializer validation on available worker versions

Sentry Issue: ARKINDEX-BACKEND-44

AttributeError: 'NoneType' object has no attribute 'id'
(7 additional frame(s) were not displayed)
...
  File "rest_framework/views.py", line 502, in dispatch
    response = handler(request, *args, **kwargs)
  File "arkindex/dataimport/api.py", line 344, in post
    dataimport.start()
  File "arkindex/dataimport/models.py", line 240, in start
    self.workflow = self.build_workflow(ml_tools, chunks, thumbnails)
  File "arkindex/dataimport/models.py", line 227, in build_workflow
    tasks[f'{worker_run.version.slug}'] = worker_run.build_task_recipe(import_task_name, elements_path)
  File "arkindex/dataimport/models.py", line 477, in build_task_recipe
    'artifact': str(self.version.docker_image.id),

This error is caused by a WorkerVersion in an available state that has no docker_image (FK to Artifact). There appears to be no validation in the API endpoints, the serializers, or the database requiring available worker versions to have an artifact. This can be enforced on the database side with a CheckConstraint, without affecting the build task, since that task already makes a single API call that sends the available state along with the artifact ID. The exact reason why an available worker version lost its artifact (although it did get built in this process) could not be determined, so adding extra validation should help us find out.
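As a rough illustration of the rule such a constraint would enforce, here is a minimal sketch using the standard-library `sqlite3` module instead of Django, so it can run on its own. The table name, column names, constraint name and state values (`worker_version`, `state`, `docker_image_id`, `'available'`, `'created'`) are assumptions made for the example, not the actual Arkindex schema. In Django, the same condition would be declared as a CheckConstraint on the model.

```python
import sqlite3

# Hypothetical, simplified schema: all names and state values are assumptions.
SCHEMA = """
CREATE TABLE worker_version (
    id INTEGER PRIMARY KEY,
    state TEXT NOT NULL,
    docker_image_id TEXT,
    -- An available version must reference an artifact.
    CONSTRAINT available_requires_docker_image
        CHECK (state != 'available' OR docker_image_id IS NOT NULL)
);
"""


def try_insert(conn, state, docker_image_id):
    """Return True if the row is accepted, False if the check constraint rejects it."""
    try:
        # The connection context manager commits on success and rolls back on error.
        with conn:
            conn.execute(
                "INSERT INTO worker_version (state, docker_image_id) VALUES (?, ?)",
                (state, docker_image_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A version that is not yet available may have no artifact.
created_without_artifact = try_insert(conn, "created", None)
# An available version with an artifact is valid.
available_with_artifact = try_insert(conn, "available", "artifact-1")
# An available version without an artifact is rejected by the database.
available_without_artifact = try_insert(conn, "available", None)
```

With a check like this in place, the invalid state behind the Sentry error would be rejected at write time rather than surfacing later in build_task_recipe.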

Since production already contains invalid data, the migration should include a RunPython operation that updates existing WorkerVersions before the constraint is added: any version that is available but has no artifact should be moved to an error state.
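The data fix could look roughly like the following sketch, again written against `sqlite3` so it runs standalone. The table layout, the sample rows and the `'error'` state value are assumptions for illustration. In the real migration, this logic would live in the function passed to RunPython, placed before the operation that adds the constraint.

```python
import sqlite3


def fix_invalid_versions(conn):
    """Move available versions without an artifact to the error state.

    Returns the number of rows that were updated.
    """
    with conn:
        cursor = conn.execute(
            "UPDATE worker_version SET state = 'error' "
            "WHERE state = 'available' AND docker_image_id IS NULL"
        )
    return cursor.rowcount


# Hypothetical table without the constraint, seeded with one invalid row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE worker_version ("
    "id INTEGER PRIMARY KEY, state TEXT NOT NULL, docker_image_id TEXT)"
)
conn.executemany(
    "INSERT INTO worker_version (state, docker_image_id) VALUES (?, ?)",
    [
        ("available", "artifact-1"),  # valid: available with an artifact
        ("available", None),          # invalid: available without an artifact
        ("created", None),            # valid: not available yet
    ],
)

fixed = fix_invalid_versions(conn)
states = [row[0] for row in conn.execute("SELECT state FROM worker_version ORDER BY id")]
```

Running the update first matters because adding the constraint would fail while any row still violates it.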

Edited by Erwan Rouchet