IntegrityError when restarting a task on a GPU that is already in use
Sentry Issue: ARKINDEX-BACKEND-212
UniqueViolation: duplicate key value violates unique constraint "unique_gpu_on_active_tasks"
DETAIL: Key (gpu_id)=(c0cfdc74-a9ac-ba56-4317-ee3d04afbc95) already exists.
File "django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint "unique_gpu_on_active_tasks"
DETAIL: Key (gpu_id)=(c0cfdc74-a9ac-ba56-4317-ee3d04afbc95) already exists.
(19 additional frame(s) were not displayed)
...
File "contextlib.py", line 79, in inner
return func(*args, **kwds)
File "arkindex/ponos/api.py", line 259, in create
The RestartTask endpoint does not reset the agent_id or gpu_id fields when cloning the task. This means the task will always be restarted on the same agent and same GPU without going through the usual agent selection algorithm. When a GPU already has a task in a non-final state assigned, this will cause an IntegrityError because assigning the GPU to two tasks at once is forbidden.
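One possible direction for a fix, sketched outside of Django so it can be run on its own: clear both fields on the clone so the restarted task goes back through agent selection. The Task dataclass, the clone_for_restart helper and the "pending" state name below are assumptions made for illustration; they are not the actual Ponos models or the code in arkindex/ponos/api.py.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for the Ponos task model (illustration only)."""
    id: uuid.UUID
    slug: str
    state: str
    agent_id: Optional[uuid.UUID] = None
    gpu_id: Optional[uuid.UUID] = None


def clone_for_restart(task: Task) -> Task:
    """Return a copy of the task that is ready to be scheduled again.

    agent_id and gpu_id are cleared so the clone is not pinned to the
    agent and GPU of the original run. The "pending" state name is an
    assumption, not taken from the real code base.
    """
    return replace(
        task,
        id=uuid.uuid4(),
        state="pending",
        agent_id=None,
        gpu_id=None,
    )


# Example: a failed task that was assigned to an agent and a GPU.
original = Task(
    id=uuid.uuid4(),
    slug="train",
    state="failed",
    agent_id=uuid.uuid4(),
    gpu_id=uuid.UUID("c0cfdc74-a9ac-ba56-4317-ee3d04afbc95"),
)
restarted = clone_for_restart(original)
```

With the assignment cleared, the clone no longer carries a gpu_id that could collide with the unique_gpu_on_active_tasks constraint at creation time; the GPU would only be set again once an agent picks the task up.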