Clear Process.element before deleting elements in the corpus deletion task
Sentry Issue: ARKINDEX-BACKEND-1DC
- Create a project
- Create a process on that project, selecting any element type in the type filter, without even starting the process
- Delete the project
ForeignKeyViolation: update or delete on table "documents_elementtype" violates foreign key constraint "dataimport_dataimpor_folder_type_id_e51648ab_fk_documents" on table "process_process"
DETAIL: Key (id)=(c8516259-fd84-4cc1-afa1-5e22d7618bba) is still referenced from table "process_process".
File "django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
IntegrityError: update or delete on table "documents_elementtype" violates foreign key constraint "dataimport_dataimpor_folder_type_id_e51648ab_fk_documents" on table "process_process"
DETAIL: Key (id)=(c8516259-fd84-4cc1-afa1-5e22d7618bba) is still referenced from table "process_process".
(6 additional frame(s) were not displayed)
...
File "django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
File "django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
- Cry
When a corpus is deleted, all elements and element types are deleted before the processes. Any process that still refers to an element type or an element through Process.folder_type, Process.element_type, Process.element, Process.test_folder, Process.validation_folder, or Process.train_folder therefore causes foreign key errors.
We can fix the ElementType-related errors by just removing element types after removing processes, but the Element-related errors (those that come from the training process foreign keys) are trickier to resolve.
If we delete elements after deleting processes, then we also have to delete WorkerRuns before deleting elements, which will cause more integrity errors as it is very likely that some elements will still refer to those WorkerRuns. We had a similar issue with Corpus.top_level_type, where we couldn't delete the corpus before deleting the types and we couldn't delete the types before deleting the corpus.
We can solve this with a corpus.processes.update(test_folder=None, train_folder=None, validation_folder=None, element=None) before any actual deletion happens, removing all Element-related integrity errors and allowing us to delete elements before processes.