Allow selecting direct children in processes
https://redmine.teklia.com/issues/10734
The inconsistencies between the element API filters and the process filters are causing issues with new users, as what they see before creating a process is not what the process selects. Implementing a full match between those APIs and ListProcessElements is not easy, as performance is crucial in this endpoint, and it works and is used differently. We can avoid the most common error by allowing child elements to be selected non-recursively, as running your process on every page in a folder is among the most common use cases.
`Process.load_children` should be redefined as an `EnumField`. The new `ProcessChildrenOption` enum includes:
- `none`, load no children (default)
- `direct`, only load direct children, non-recursively
- `all`, load all children recursively
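As a rough sketch of what the enum could look like, using only the standard library: the upper-case member names and the `DEFAULT_CHILDREN_OPTION` constant are assumptions made for illustration, only the three string values come from this issue.

```python
from enum import Enum


class ProcessChildrenOption(Enum):
    """Which children of the selected elements a process should load."""

    # Member names are hypothetical; the values are the ones listed in this issue.
    NONE = "none"      # load no children (default)
    DIRECT = "direct"  # only load direct children, non-recursively
    ALL = "all"        # load all children recursively


# Hypothetical constant mirroring the "(default)" note on `none`.
DEFAULT_CHILDREN_OPTION = ProcessChildrenOption.NONE
```

All three values are at most 10 characters long, so they fit in the `varchar(10)` column used by the migration.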
The migration should use a `RunSQL` with `state_operations` to have the `AlterField` use these queries:
- Forward: `ALTER TABLE process_process ALTER COLUMN load_children TYPE varchar(10) USING CASE WHEN load_children THEN 'all' ELSE 'none' END;`
- Reverse: `ALTER TABLE process_process ALTER COLUMN load_children TYPE boolean USING load_children <> 'none';`

This will perform the data migration directly. `load_children=False` becomes `none`, and `True` becomes `all`.
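To make the effect of the two `USING` expressions concrete, here is a plain Python mirror of them. This is only an illustration of the value mapping, not the migration itself; the `forward` and `reverse` function names are made up for this sketch.

```python
def forward(load_children: bool) -> str:
    """Mirror of: CASE WHEN load_children THEN 'all' ELSE 'none' END"""
    return "all" if load_children else "none"


def reverse(load_children: str) -> bool:
    """Mirror of: load_children <> 'none'"""
    return load_children != "none"


# Existing boolean values survive a forward + reverse round trip.
assert reverse(forward(True)) is True
assert reverse(forward(False)) is False

# 'direct' has no boolean equivalent: reversing turns it into True,
# so applying the forward migration again would yield 'all'.
assert forward(reverse("direct")) == "all"
```

The last assertion shows that reversing the migration is lossy for processes that use `direct`, which is probably acceptable but worth knowing before rolling back.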
`Process.list_elements` needs to be updated to support these new options:
- On processes whose modes are not `Workers` or `Export`, return an empty queryset
- If there is one element assigned as `Process.element`:
  - Build an initial `Q` filter that only looks for this element by ID
  - If `load_children` is not `none`:
    - Add a `| Q(paths__path__overlap=[element_id])` to include children
    - If `load_children` is `direct`, also add `paths__path__last=element_id` to restrict to direct children
- Try to list the `id` and `corpus_id` found in `Process.elements.all()`. If the list is not empty:
  - Fail if any `corpus_id` is not the corpus ID of the process
  - Build the initial `Q` filter on the list of `id`
  - If `load_children` is not `none`:
    - Add a `| Q(paths__path__overlap=element_ids)` to include children
    - If `load_children` is `direct`, also add `paths__path__last__in=element_ids` to restrict to direct children
- Otherwise the process is running on the whole corpus:
  - When `load_children` is `none`, return an empty queryset
  - With `direct`, return all elements in the corpus with `paths__path=[]`
  - With `all`, return all elements in the corpus
- On all queries, the filters returned by `_get_filters()` should be applied, so that the name/type/class filters continue to work.
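The expected selection for each option can be sketched with a small in-memory model, which could also serve as a reference when writing unit tests. This is not the ORM implementation: `Element`, `select` and the sample `corpus` are hypothetical names, each path is modelled as a tuple of ancestor IDs, and the process mode check, the corpus check and `_get_filters()` are left out.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessChildrenOption(Enum):
    NONE = "none"
    DIRECT = "direct"
    ALL = "all"


@dataclass(frozen=True)
class Element:
    id: str
    # One tuple of ancestor IDs per path; a top-level element has one empty path.
    paths: tuple = ((),)


def select(elements, selected_ids, option):
    """Return the IDs of the elements a process would run on (simplified model)."""
    selected = set(selected_ids)
    result = set()
    for element in elements:
        if not selected:
            # Whole corpus: nothing for "none", top-level elements for "direct".
            if option is ProcessChildrenOption.ALL:
                keep = True
            elif option is ProcessChildrenOption.DIRECT:
                keep = any(path == () for path in element.paths)
            else:
                keep = False
        elif element.id in selected:
            keep = True  # counterpart of the initial filter on element IDs
        elif option is ProcessChildrenOption.ALL:
            # Counterpart of paths__path__overlap: any ancestor is selected.
            keep = any(selected & set(path) for path in element.paths)
        elif option is ProcessChildrenOption.DIRECT:
            # Counterpart of paths__path__last: the direct parent is selected.
            keep = any(path and path[-1] in selected for path in element.paths)
        else:
            keep = False
        if keep:
            result.add(element.id)
    return result


# A folder containing a page, which itself contains a line.
corpus = [
    Element("folder"),
    Element("page", paths=(("folder",),)),
    Element("line", paths=(("folder", "page"),)),
]

assert select(corpus, ["folder"], ProcessChildrenOption.NONE) == {"folder"}
assert select(corpus, ["folder"], ProcessChildrenOption.DIRECT) == {"folder", "page"}
assert select(corpus, ["folder"], ProcessChildrenOption.ALL) == {"folder", "page", "line"}
```

With `direct`, selecting the folder picks up its pages but not the lines inside them, which is the "every page in a folder" use case described above.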
Performance tests are necessary because multiple options can have issues:
- The ElementPath filters are using `OR` and might need to use a `UNION` instead, as we know this can be an issue
- We already know that running a process from a large selection can have issues as we are loading every element in RAM, and this might not be avoidable
- `paths__path__last__in=[…]` might not be using an index at all
- `paths__path__overlap` alone without a `last` filter is sometimes slower because `last` can use a faster B-tree index