CreateElementChildren endpoint
https://redmine.teklia.com/issues/8049
A new endpoint, CreateElementChildren, will be added to Arkindex to add multiple elements as children of a single parent at once.
URL: /api/v1/elements/{id}/children/ (ideally the same as ListElementChildren & DestroyElementChildren; if that is too complex, use /api/v1/elements/{id}/children/bulk to match other bulk endpoints)
Method: POST
Body:
{
    "children": [
        "<UUID1>",
        "<UUID2>",
        "<UUID3>"
    ],
    "worker_run_id": "<worker_run_id>"
}
Parameters:
- Parent element ID as uuid from path
- List of child elements to link to the parent in JSON body
- WorkerRun ID in JSON body
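As an illustration of the request shape described above, here is a minimal sketch that builds the method, URL and JSON body for the proposed endpoint. The helper name `build_create_children_request` and the `base_url` default are hypothetical and only serve the example; the sketch builds the request without sending it.

```python
import json
import uuid


def build_create_children_request(parent_id, child_ids, worker_run_id,
                                  base_url="https://arkindex.example.com"):
    """Return (method, url, body) for the proposed CreateElementChildren call.

    base_url is a placeholder, not a real instance.
    """
    # Round-trip every ID through uuid.UUID so malformed values fail early
    parent = uuid.UUID(str(parent_id))
    body = {
        # Input order is kept, as it drives the ordering of the new children
        "children": [str(uuid.UUID(str(child))) for child in child_ids],
        "worker_run_id": str(uuid.UUID(str(worker_run_id))),
    }
    url = f"{base_url}/api/v1/elements/{parent}/children/"
    return "POST", url, json.dumps(body)
```

The returned tuple can be handed to any HTTP client; authentication is left out on purpose since this document does not cover it.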
Checks:
- Usual checks on worker runs (existence + belongs to user or user has access to it). Feasible with the WorkerRunIDField.
- parent & children must
  - all exist
  - be in the same corpus
- The child elements must not have any grandchildren as this is unsupported; this check is critical as the situation would become incomprehensible afterwards. This must be documented in the API.
- Adding the new parent should not create any cycle (none of the child IDs must be contained in any of the paths)
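The cycle check can be done in memory once the parent's paths are loaded. The sketch below is one possible reading of that rule; the function name `find_cycle_candidates` is hypothetical, and treating the parent's own ID as forbidden is an assumption added here (the text above only mentions the IDs contained in the paths).

```python
import uuid


def find_cycle_candidates(parent_id, parent_paths, child_ids):
    """Return the child IDs that would create a cycle if linked under the parent.

    parent_paths is a list of paths, each path being a list of ancestor UUIDs.
    """
    # Every element appearing in any path of the parent is one of its ancestors
    ancestors = {parent_id}  # assumption: an element cannot be its own child
    for path in parent_paths:
        ancestors.update(path)
    # Keep input order so that an error message can list the offending IDs as sent
    return [child_id for child_id in child_ids if child_id in ancestors]
```

A validation step could reject the request whenever this list is not empty.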
Workflow:
- Load all existing paths for the parent (1 query on ElementPath)
- Extend them by adding the parent (in memory, no request)
- Load all max orderings for extended paths (1 query on ElementPath)
- Index orderings by path (using a hash method on path, in memory, no request)
- Instantiate a new ElementPath for each of these paths + children, using orderings & offset from children (in memory, no request)
- UPSERT these ElementPaths, using the ordering as the only field to update in case of conflicts
The UPSERT is the key part here: we only update the ordering on conflicts, which means a child already present will have its position moved to the end of the stack. This should also be documented in the API.
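To make the workflow and the UPSERT behaviour concrete, here is a small in-memory simulation that does not touch the database. It assumes a table represented as a dict mapping `(path, element_id)` to an ordering, uses short strings instead of UUIDs for readability, and assumes new children are placed right after the current maximum ordering. The name `add_children` is hypothetical.

```python
def add_children(rows, parent_id, child_ids):
    """Simulate the workflow on rows: {(path tuple, element id): ordering}."""
    # Load the parent's paths and extend each of them with the parent itself
    extended = [
        path + (parent_id,)
        for (path, element_id) in rows
        if element_id == parent_id
    ]
    # Compute the max ordering currently used under each extended path
    max_ordering = {}
    for (path, _), ordering in rows.items():
        if path in extended:
            max_ordering[path] = max(ordering, max_ordering.get(path, -1))
    # UPSERT: a new key is inserted, an existing key only gets its ordering updated
    for offset, child_id in enumerate(child_ids):
        for path in extended:
            rows[(path, child_id)] = max_ordering.get(path, -1) + 1 + offset
    return rows
```

With a parent `P` that already has children `A` (ordering 0) and `B` (ordering 1), adding `["A", "C"]` moves `A` to ordering 2, inserts `C` at ordering 3, and leaves `B` untouched.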
POC
This will need to be implemented in a serializer's validate and create methods while avoiding stale reads and N+1 queries.
from arkindex.documents.models import Element, ElementPath
import uuid
from django.db.models import Max
import hashlib
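parent = Element.objects.get(pk='948c2bef-4076-45a9-b2e7-c5439c33d553')
child_ids = [
    "0b584d49-9f38-4bdf-841a-80574b739e46",
    "83dbf23d-6ad1-4b88-9270-644e5cbc403b",
    "945c3c7f-c0d5-4056-920b-b393f6d05aeb",
    "12dc9968-6587-4b4b-82a4-0008f19b594a",
]
# The database does not guarantee the order of results, so sort them back into input order
children = sorted(
    Element.objects.filter(id__in=child_ids),
    key=lambda child: child_ids.index(str(child.id)),
)
# Checks
for child in children:
    # Compare the foreign key values to avoid one corpus query per child
    assert parent.corpus_id == child.corpus_id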
# Load paths
paths = parent.paths.all().values_list('path', flat=True)
# Extend them
paths = [
    p + [parent.id]
    for p in paths
]
# Method to index path
def hash_path(path):
    h = hashlib.md5()
    for element in path:
        assert isinstance(element, uuid.UUID)
        h.update(element.bytes)
    return h.hexdigest()
# Load their max orderings
# This will be empty at first (no children), so we cannot iterate directly on orderings
# Calling values() before annotate() groups by path, giving one maximum per path
orderings = {
    hash_path(p["path"]): p["max_ordering"]
    for p in ElementPath.objects.filter(path__in=paths).values("path").annotate(max_ordering=Max("ordering")).order_by()
}
# Create new paths
new_paths = [
    ElementPath(
        path=path,
        element_id=child.id,
        # Start right after the current maximum; -1 lets the first child of an empty path get 0
        ordering=orderings.get(hash_path(path), -1) + 1 + offset
    )
    for offset, child in enumerate(children)  # Note: this is the only place where we should respect input order
    for path in paths
]
# Bulk UPSERT
ElementPath.objects.bulk_create(new_paths, update_conflicts=True, update_fields=("ordering", ), unique_fields=("path", "element_id"))
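The hash_path helper is needed because a path is a list, and lists cannot be used as dictionary keys. A possible simplification, which is not part of the POC above, would be to convert each path to a tuple: tuples of UUIDs are hashable and compare element by element, so no digest is required. The name `path_key` is hypothetical.

```python
import uuid


def path_key(path):
    """Turn a path (list of UUIDs) into a hashable dictionary key."""
    return tuple(path)


# Illustrative IDs only
first = uuid.UUID("948c2bef-4076-45a9-b2e7-c5439c33d553")
second = uuid.UUID("0b584d49-9f38-4bdf-841a-80574b739e46")

# Index a max ordering by path, then look it up from a separate list instance
orderings = {path_key([first, second]): 3}
found = orderings.get(path_key([first, second]), -1)
# The order of elements matters, so the reversed path is a different key
missing = orderings.get(path_key([second, first]), -1)
```

Whether this is preferable to an MD5 digest depends on how long paths get in practice; the digest keeps keys at a fixed size, while the tuple avoids any risk of collision.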