Dismantle entities

https://redmine.teklia.com/issues/11004

Requires #1919 (closed), #1920 (closed), #1921 (closed)

Model changes

  • The Entity model is removed

  • MetaData.entity_id is removed

  • MetaData.clean() and MetaData.save() are removed

  • The Transcription.entities M2M is removed

  • TranscriptionEntity.entity is removed

  • TranscriptionEntity.type is a new foreign key to EntityType. It is required, and its on_delete is set to DO_NOTHING so that we are forced to write optimized queries when deleting an EntityType.

  • A unique_transcription_entity constraint should ensure uniqueness on (transcription_id, type_id, offset, length, worker_run_id)

  • A data migration should set the type of every existing TranscriptionEntity to the type of its entity
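
The data migration and the unique_transcription_entity constraint can be sketched outside of Django with the standard library's sqlite3 module. This is only an illustration under assumptions: the table names, column layout and sample rows are hypothetical, and the real change would be written as Django migrations against the actual schema.

```python
import sqlite3
import uuid


def new_id() -> str:
    return str(uuid.uuid4())


# Hypothetical, simplified schema: only the columns relevant to this change.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entity_type (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE entity (id TEXT PRIMARY KEY, type_id TEXT NOT NULL);
    CREATE TABLE transcription_entity (
        id TEXT PRIMARY KEY,
        transcription_id TEXT NOT NULL,
        entity_id TEXT,          -- legacy link, removed by this change
        type_id TEXT,            -- new link, filled by the data migration
        "offset" INTEGER NOT NULL,
        length INTEGER NOT NULL,
        worker_run_id TEXT NOT NULL
    );
    """
)

# Sample data: two entities of different types on one transcription.
person, place = new_id(), new_id()
entity_a, entity_b = new_id(), new_id()
transcription, run = new_id(), new_id()
conn.executemany(
    "INSERT INTO entity_type VALUES (?, ?)", [(person, "person"), (place, "place")]
)
conn.executemany(
    "INSERT INTO entity VALUES (?, ?)", [(entity_a, person), (entity_b, place)]
)
conn.executemany(
    "INSERT INTO transcription_entity VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (new_id(), transcription, entity_a, None, 0, 5, run),
        (new_id(), transcription, entity_b, None, 14, 5, run),
    ],
)

# Data migration: copy each entity's type onto its transcription entities.
conn.execute(
    """
    UPDATE transcription_entity
    SET type_id = (
        SELECT entity.type_id FROM entity
        WHERE entity.id = transcription_entity.entity_id
    )
    """
)

# Uniqueness on (transcription_id, type_id, offset, length, worker_run_id).
conn.execute(
    """
    CREATE UNIQUE INDEX unique_transcription_entity
    ON transcription_entity (transcription_id, type_id, "offset", length, worker_run_id)
    """
)

migrated_types = [
    row[0]
    for row in conn.execute('SELECT type_id FROM transcription_entity ORDER BY "offset"')
]

# A second row with the same key is now rejected by the database.
try:
    conn.execute(
        "INSERT INTO transcription_entity VALUES (?, ?, ?, ?, ?, ?, ?)",
        (new_id(), transcription, None, person, 0, 5, run),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The ordering matters in this sketch: the type is copied while the legacy entity link still exists, and the unique index is only created once every row has its type.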

Search changes

  • The entity_* fields are renamed to transcriptionentity_*, which will require reindex --drop
  • The Indexer prefetches transcription entities through transcriptions, instead of entities through both transcriptions and metadata
  • Indexer.build_entities becomes Indexer.build_transcription_entities
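
As a rough sketch of what building documents from transcription entities through transcriptions could look like, the example below walks each transcription and emits one set of transcriptionentity_* fields per transcription entity. The dataclasses, the two field names (transcriptionentity_type, transcriptionentity_text) and the idea of deriving the text from offset and length are assumptions made for illustration; they are not taken from the actual Indexer.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    name: str


@dataclass
class TranscriptionEntity:
    type: EntityType
    offset: int
    length: int


@dataclass
class Transcription:
    text: str
    transcription_entities: list = field(default_factory=list)


def build_transcription_entities(transcriptions):
    """Flatten transcription entities, reached only through transcriptions."""
    documents = []
    for transcription in transcriptions:
        for entity in transcription.transcription_entities:
            end = entity.offset + entity.length
            documents.append(
                {
                    # Hypothetical field names using the new prefix.
                    "transcriptionentity_type": entity.type.name,
                    "transcriptionentity_text": transcription.text[entity.offset:end],
                }
            )
    return documents


transcription = Transcription(
    text="Alice went to Paris",
    transcription_entities=[
        TranscriptionEntity(EntityType("person"), offset=0, length=5),
        TranscriptionEntity(EntityType("place"), offset=14, length=5),
    ],
)
documents = build_transcription_entities([transcription])
```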

API changes

  • The entity_worker_run and entity_worker_version filters are removed from ListTranscriptionEntities

  • BaseEntitySerializer is removed

  • TranscriptionEntityCreateSerializer is removed

  • TranscriptionEntitySerializer is updated:

    • entity is replaced by a read-only EntityTypeLightSerializer named type
    • A new write-only type_id field points to an EntityType and replaces TranscriptionEntityCreateSerializer. As a result, entity is replaced with type_id in CreateTranscriptionEntity.
  • entities is renamed to transcription_entities in TranscriptionEntityBulkSerializer, which affects CreateTranscriptionEntities

  • entity_id is replaced with type_id in TranscriptionEntityBulkItemSerializer, which affects CreateTranscriptionEntities

  • CreateTranscriptionEntity should ensure that no TranscriptionEntity on the same transcription has the same type, offset, length and worker run before creating

  • CreateTranscriptionEntities should:

    • ensure that no TranscriptionEntity on the same transcription has the same type, offset, length and worker run before creating
    • return {"transcription_entity_ids": [array of UUIDs]}
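
The expected behaviour of CreateTranscriptionEntities can be sketched in plain Python as follows. This is a minimal illustration, not the endpoint's implementation: the in-memory set standing in for the database, the DuplicateTranscriptionEntity exception, the position of worker_run_id at the top level of the payload, and the rejection of duplicates within a single payload are all assumptions.

```python
import uuid


class DuplicateTranscriptionEntity(ValueError):
    """Hypothetical error raised when a transcription entity already exists."""


def create_transcription_entities(existing_keys, transcription_id, payload):
    """Check uniqueness, then return the IDs of the created transcription entities.

    existing_keys is a set of
    (transcription_id, type_id, offset, length, worker_run_id) tuples
    standing in for the rows already stored in the database.
    """
    worker_run_id = payload["worker_run_id"]
    new_keys = []
    for item in payload["transcription_entities"]:
        key = (
            transcription_id,
            item["type_id"],
            item["offset"],
            item["length"],
            worker_run_id,
        )
        # Reject duplicates of stored rows and duplicates inside the payload.
        if key in existing_keys or key in new_keys:
            raise DuplicateTranscriptionEntity(f"Already exists: {key}")
        new_keys.append(key)

    # Nothing is stored unless every item passed the check.
    existing_keys.update(new_keys)
    return {"transcription_entity_ids": [str(uuid.uuid4()) for _ in new_keys]}


existing = set()
payload = {
    "worker_run_id": "run-1",
    "transcription_entities": [
        {"type_id": "type-person", "offset": 0, "length": 5},
        {"type_id": "type-place", "offset": 14, "length": 5},
    ],
}
response = create_transcription_entities(existing, "transcription-1", payload)

# Sending the same payload again is refused.
try:
    create_transcription_entities(existing, "transcription-1", payload)
    second_call_rejected = False
except DuplicateTranscriptionEntity:
    second_call_rejected = True
```

Checking before creating lets the endpoint report a validation error rather than surfacing a database integrity error from the unique_transcription_entity constraint.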