Ensure child path uniqueness in add_parent when there are multiple parents

Erwan Rouchet requested to merge unique-child-paths into master

Closes #778 (closed)

Let's say you want to perform this operation:

Before:

@startmindmap
+ Le Element
++ Le Child
+++ Le Grandchild
-- Parent 1
-- Parent 2
@endmindmap

After:

@startmindmap
+ Le Element
++ Le Child
+++ Le Grandchild
-- Parent 1
-- Parent 2
-- Parent 3
@endmindmap

The initial contents of ElementPath will be like so (the ordering column is ignored here as everything will be 0):

Element       | Path
--------------|--------------------------------
Le Element    | Parent 1
Le Element    | Parent 2
Le Child      | Parent 1, Le Element
Le Child      | Parent 2, Le Element
Le Grandchild | Parent 1, Le Element, Le Child
Le Grandchild | Parent 2, Le Element, Le Child

Right after adding the Parent 3 path to Le Element, Element.add_parent handles the paths of the child elements like so:

  • For each child element path
    • If the path starts with Le Element, nuke it (it will be replaced by one with Le Element + the new parent)
    • Else:
      • Strip off the other parent (remove everything before Le Element in the path)
      • For each parent of the current element, now including the new one (Parent 1, 2, 3):
        • Create a new path with the parents + Le Element + whatever was after Le Element in the original path.
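The steps above can be modelled in a few lines of plain Python. This is only a sketch and not the actual Element.add_parent code: paths are represented as tuples of element names, rewrite_child_paths is a hypothetical helper, and the "path starts with Le Element" branch is folded into the general case, on the assumption that stripping an empty prefix yields the same replacement path.

```python
def rewrite_child_paths(element, parent_paths, child_paths):
    """Hypothetical model of the child path rewrite described above.

    element      -- name of the element that received a new parent
    parent_paths -- all paths of that element after the addition
    child_paths  -- existing paths of one descendant of the element
    """
    rewritten = []
    for path in child_paths:
        # Strip the other parent: keep the element and everything after it.
        suffix = path[path.index(element):]
        # Re-create the path under every parent path of the element.
        for parent_path in parent_paths:
            rewritten.append(parent_path + suffix)
    return rewritten


parent_paths = [("Parent 1",), ("Parent 2",), ("Parent 3",)]
grandchild_paths = [
    ("Parent 1", "Le Element", "Le Child"),
    ("Parent 2", "Le Element", "Le Child"),
]

result = rewrite_child_paths("Le Element", parent_paths, grandchild_paths)
print(len(result))       # 6 paths are generated
print(len(set(result)))  # only 3 of them are distinct
```

Each existing path is expanded once per parent, so two existing paths and three parents give six paths for only three distinct values.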

In the above example, this means the grandchild's paths will be edited like so:

Path                           | Strip the other parent | Add our parents
-------------------------------|------------------------|-------------------------------
Parent 1, Le Element, Le Child | Le Element, Le Child   | Parent 1, Le Element, Le Child
                               |                        | Parent 2, Le Element, Le Child
                               |                        | Parent 3, Le Element, Le Child
Parent 2, Le Element, Le Child | Le Element, Le Child   | Parent 1, Le Element, Le Child
                               |                        | Parent 2, Le Element, Le Child
                               |                        | Parent 3, Le Element, Le Child

And there we go, we just duplicated every path!

This issue does not occur when adding the first or second parent, only when adding a third, fourth, … parent to an element that has a child: once the element has n existing parents, each child has n paths and each of them is expanded into n + 1 paths, so duplicates appear as soon as n is 2 or more. This is a rather rare case, but it might be the reason why we had such strange structures in corpora that used datasets or where elements were regularly moved around. We also did not cover this case in unit tests, so I added a case that goes up to 50 parents; we already know that going over 6 parents will cause issues in the frontend anyway (requests#1).

This merge request fixes the issue the hardcore way by using a set, which really ensures the entire algorithm, no matter how flawed, cannot create duplicate paths. Maybe, one day, I will be granted the right to finally work on this issue properly and turn everything into SQL or PL/SQL… 🙏
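As a rough illustration of the set-based approach, and not the actual diff of this merge request, the sketch below collects the generated paths in a Python set so that duplicates collapse no matter how many times a path is produced. The tuple representation of paths and the helper name rewrite_child_paths_unique are assumptions made for this example.

```python
def rewrite_child_paths_unique(element, parent_paths, child_paths):
    """Hypothetical model: same rewrite as described above, deduplicated with a set."""
    unique_paths = set()
    for path in child_paths:
        # Keep the element and everything after it, dropping the old parent.
        suffix = path[path.index(element):]
        for parent_path in parent_paths:
            # Adding an already-present tuple to a set is a no-op.
            unique_paths.add(parent_path + suffix)
    return unique_paths


parent_paths = [("Parent 1",), ("Parent 2",), ("Parent 3",)]
grandchild_paths = [
    ("Parent 1", "Le Element", "Le Child"),
    ("Parent 2", "Le Element", "Le Child"),
]

unique = rewrite_child_paths_unique("Le Element", parent_paths, grandchild_paths)
for path in sorted(unique):
    print(", ".join(path))
```

Tuples are hashable, which is what lets whole paths be stored in a set; a list-based path representation would need to be converted first.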

Edited by Erwan Rouchet
