Optional dataset element uniqueness
https://redmine.teklia.com/issues/6264
Requires #1712 (closed)
In some datasets, but not all of them, we need to ensure that an element is only present in a single set at a time. This is a first step towards preventing data leakage. Since this does not apply to every dataset, we cannot just use a unique constraint.
A new Dataset.unique_elements boolean field should be added, defaulting to True. It should be exposed in ListCorpusDatasets and RetrieveDataset, and editable in CreateDataset, UpdateDataset and PartialUpdateDataset. This can be made visible in the Django admin, but it must not be editable there.
When unique_elements is enabled, CreateDatasetElement should return an HTTP 400 if the element is already present in another set of the same dataset, and the error message should mention that set's name.
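The check above could look roughly like the following framework-free sketch. Everything in it is an assumption made for illustration: the `Dataset` dataclass, the `add_element` helper and the `ValidationError` class (which stands in for the HTTP 400 response) are hypothetical and are not the actual models or serializers.

```python
from dataclasses import dataclass, field


class ValidationError(Exception):
    """Hypothetical stand-in for the HTTP 400 response the API would return."""


@dataclass
class Dataset:
    # Hypothetical in-memory model, not the real Django model
    name: str
    unique_elements: bool = True
    # Maps each set name to the IDs of the elements it contains
    sets: dict = field(default_factory=dict)


def add_element(dataset: Dataset, set_name: str, element_id: str) -> None:
    """Add an element to a set, enforcing uniqueness only when it is enabled."""
    if dataset.unique_elements:
        for other_name, elements in dataset.sets.items():
            # Only other sets of the same dataset are considered
            if other_name != set_name and element_id in elements:
                raise ValidationError(
                    f"The element is already present in set {other_name!r}."
                )
    dataset.sets.setdefault(set_name, set()).add(element_id)
```

This sketch only covers the cross-set check: adding an element to a set that already contains it is silently ignored here, whatever the real endpoint does in that case.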
When updating this field to True using UpdateDataset or PartialUpdateDataset, an HTTP 400 should be returned if the dataset currently contains elements that are in multiple sets at once.
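A minimal sketch of this update validation, assuming a dataset's sets are available as a mapping from set name to element IDs. The function names are hypothetical, and `ValueError` stands in for the HTTP 400 response.

```python
from collections import Counter


def find_duplicated_elements(sets: dict) -> set:
    """Return the IDs of the elements that appear in more than one set."""
    counts = Counter(
        element_id for elements in sets.values() for element_id in elements
    )
    return {element_id for element_id, count in counts.items() if count > 1}


def validate_unique_elements(new_value: bool, sets: dict) -> bool:
    """Reject enabling uniqueness while some elements are in multiple sets.

    ValueError is a hypothetical stand-in for the HTTP 400 response.
    """
    if new_value and find_duplicated_elements(sets):
        raise ValueError(
            "Elements of this dataset are currently in multiple sets at once."
        )
    return new_value
```

Setting the field to False never needs the check, so the duplicate lookup only runs when uniqueness is being enabled.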
A data migration should set unique_elements to False on every existing dataset that has elements in multiple sets at once. Every other dataset should be made unique, since we assume that enforcing uniqueness is the preferred option for most datasets.
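The decision the data migration has to make for each dataset can be sketched as below. This is plain Python over an assumed in-memory structure (dataset name, then set name, then element IDs), not an actual Django migration; `compute_unique_flags` is a hypothetical name.

```python
def compute_unique_flags(datasets: dict) -> dict:
    """Return the unique_elements value each existing dataset should get.

    `datasets` maps a dataset name to a mapping of set name -> element IDs.
    """
    flags = {}
    for dataset_name, sets in datasets.items():
        seen = set()
        has_duplicates = False
        for elements in sets.values():
            # An element already seen in a previous set is in multiple sets
            if seen & elements:
                has_duplicates = True
                break
            seen |= elements
        # Datasets without duplicates are made unique; the others are not
        flags[dataset_name] = not has_duplicates
    return flags
```

For example, a dataset whose train and dev sets share an element would get False, while a dataset with disjoint sets, or no sets at all, would get True.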