Create fake worker runs for ML results with only worker versions
https://redmine.teklia.com/issues/10463
We still have both worker_version and worker_run foreign keys on elements, classifications, transcriptions, metadata, entities, and transcription entities. Having a worker_run implies having a worker_version set, but a worker_version can still be set without any worker_run, so that older ML results can still exist.
To remove the extra worker_version FK without losing any information, we need to set a worker_run on all ML results that have a worker_version but no worker_run.
A new migration should:
- List all `(corpus_id, worker_version_id)` sets for all 6 tables of ML results, where the `worker_version_id` is set but there is no `worker_run_id`.
- If nothing has been found, end here.
- Bulk create Workers processes on each corpus named `Migration of ML results without worker runs`.
- Bulk create WorkerRuns within these processes for all the listed worker versions.
- Update all ML results to set their `worker_run_id` to the newly created worker runs. It will probably be fastest to do one update for each `(corpus_id, worker_version_id)` set since this should be able to use indexes for fast access.
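The steps above could be sketched roughly as follows. This is only an illustration on a toy in-memory SQLite schema, not the actual Django migration: the table names (`process`, `worker_run`, `element`, `transcription`), their columns, the `'workers'` mode value, and the assumption that every ML result table carries its own `corpus_id` column are all hypothetical stand-ins, and only two of the six tables are modelled.

```python
import sqlite3
import uuid

PROCESS_NAME = "Migration of ML results without worker runs"
# Hypothetical stand-ins for the six ML result tables; real names will differ.
ML_TABLES = ["element", "transcription"]


def build_toy_schema(conn):
    """Create a simplified schema (an assumption, not the real one)."""
    conn.executescript(
        """
        CREATE TABLE process (id TEXT PRIMARY KEY, corpus_id TEXT, name TEXT, mode TEXT);
        CREATE TABLE worker_run (id TEXT PRIMARY KEY, process_id TEXT, version_id TEXT);
        """
    )
    for table in ML_TABLES:
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, corpus_id TEXT, "
            "worker_version_id TEXT, worker_run_id TEXT)"
        )


def migrate(conn):
    """Attach a worker run to every ML result that only has a worker version."""
    # Step 1: collect (corpus_id, worker_version_id) pairs lacking a worker run.
    pairs = set()
    for table in ML_TABLES:
        pairs.update(
            conn.execute(
                f"SELECT DISTINCT corpus_id, worker_version_id FROM {table} "
                "WHERE worker_version_id IS NOT NULL AND worker_run_id IS NULL"
            )
        )
    # Step 2: nothing to migrate, end here.
    if not pairs:
        return {}
    # Step 3: bulk create one process per corpus.
    processes = {corpus_id: str(uuid.uuid4()) for corpus_id, _ in pairs}
    conn.executemany(
        "INSERT INTO process (id, corpus_id, name, mode) VALUES (?, ?, ?, 'workers')",
        [(pid, cid, PROCESS_NAME) for cid, pid in processes.items()],
    )
    # Step 4: bulk create one worker run per listed pair, in its corpus' process.
    runs = {pair: str(uuid.uuid4()) for pair in pairs}
    conn.executemany(
        "INSERT INTO worker_run (id, process_id, version_id) VALUES (?, ?, ?)",
        [(run_id, processes[cid], vid) for (cid, vid), run_id in runs.items()],
    )
    # Step 5: one UPDATE per pair and per table, so indexes can be used.
    for (cid, vid), run_id in runs.items():
        for table in ML_TABLES:
            conn.execute(
                f"UPDATE {table} SET worker_run_id = ? WHERE corpus_id = ? "
                "AND worker_version_id = ? AND worker_run_id IS NULL",
                (run_id, cid, vid),
            )
    conn.commit()
    return runs
```

In the real migration, the per-pair updates would likely be the step worth benchmarking first, since they touch the largest tables.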
Finally, all *_worker_run_requires_worker_version constraints should be renamed *_worker_run_and_worker_version, and should now require that (worker_run_id IS NULL) = (worker_version_id IS NULL). This will ensure that you can either set both the version and run, or neither of them, removing the case where there is a version and no run.
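As a quick illustration of what the new constraint accepts and rejects, here is a minimal sketch on a toy SQLite table. The table layout and the helper names are hypothetical, and the real constraint would be declared through the project's own migrations; the null checks are parenthesised to make the grouping explicit.

```python
import sqlite3


def make_table():
    """Create a toy table carrying the proposed constraint (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE transcription (
            id INTEGER PRIMARY KEY,
            worker_version_id TEXT,
            worker_run_id TEXT,
            CONSTRAINT transcription_worker_run_and_worker_version
                CHECK ((worker_run_id IS NULL) = (worker_version_id IS NULL))
        )
        """
    )
    return conn


def insert_is_rejected(conn, worker_version_id, worker_run_id):
    """Try to insert a row; return True if the CHECK constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO transcription (worker_version_id, worker_run_id) VALUES (?, ?)",
            (worker_version_id, worker_run_id),
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

With this table, rows with both values or with neither are accepted, while a version without a run (or a run without a version) is rejected.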
These migrations will need to be tested on a rather large database as there is a high chance of hitting performance issues.