IntegrityError when restarting a task on a GPU that is already in use

Sentry Issue: ARKINDEX-BACKEND-212

UniqueViolation: duplicate key value violates unique constraint "unique_gpu_on_active_tasks"
DETAIL:  Key (gpu_id)=(c0cfdc74-a9ac-ba56-4317-ee3d04afbc95) already exists.

  File "django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)

IntegrityError: duplicate key value violates unique constraint "unique_gpu_on_active_tasks"
DETAIL:  Key (gpu_id)=(c0cfdc74-a9ac-ba56-4317-ee3d04afbc95) already exists.

(19 additional frame(s) were not displayed)
...
  File "contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "arkindex/ponos/api.py", line 259, in create

The RestartTask endpoint does not reset the agent_id or gpu_id fields when cloning the task. As a result, the restarted task is always assigned to the same agent and the same GPU, without going through the usual agent selection algorithm. If that GPU already has a task in a non-final state assigned to it, the restart fails with an IntegrityError, because the unique_gpu_on_active_tasks constraint forbids assigning a GPU to two active tasks at once.
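
A possible fix is to clear both fields on the clone so that the restarted task goes through agent selection again. The snippet below is only a sketch of that idea, not the actual Arkindex code: `Task` is a hypothetical plain dataclass standing in for the Ponos model, `clone_for_restart` is an assumed helper name, and the `"pending"` and `"failed"` state values are placeholders. Only the `agent_id` and `gpu_id` field names come from this issue.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical stand-in for the Ponos task model. Only agent_id and gpu_id
# are taken from the issue; the other fields are assumptions.
@dataclass(frozen=True)
class Task:
    id: uuid.UUID
    slug: str
    state: str
    agent_id: Optional[uuid.UUID] = None
    gpu_id: Optional[uuid.UUID] = None


def clone_for_restart(task: Task) -> Task:
    """Copy a task for a restart, dropping its agent and GPU assignment.

    With both fields cleared, the clone no longer claims the original GPU,
    so it cannot collide with another active task holding that GPU.
    """
    return replace(
        task,
        id=uuid.uuid4(),    # the clone is a new task
        state="pending",    # assumed initial state name
        agent_id=None,      # let agent selection pick an agent again
        gpu_id=None,        # release the GPU claim
    )


if __name__ == "__main__":
    original = Task(
        id=uuid.uuid4(),
        slug="example",
        state="failed",
        agent_id=uuid.uuid4(),
        gpu_id=uuid.UUID("c0cfdc74-a9ac-ba56-4317-ee3d04afbc95"),
    )
    restarted = clone_for_restart(original)
    print(restarted.agent_id, restarted.gpu_id)
```

In the real endpoint the same reset would presumably happen wherever the task is copied before being saved, so the insert never carries over the old gpu_id.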