Skip to content

Optimize children checks on CreateElementChildren and element lists

Erwan Rouchet requested to merge optimize-has-children-tests into master

Closes #1819 (closed)

I updated both the validation query of CreateElementChildren and the query used to fetch the has_children attribute on element lists. Both of those queries suffered from performance issues, but only when operating on low or very high amounts of elements. This is because PostgreSQL shows this behavior when joining on paths to test for child element existence:

  • When Postgres thinks there is a low amount of rows (maybe a hundred), it will prefer the path__last index. If only a filter on path__overlap was set in the query, then it will not use either of the indices and will instead do a Seq Scan, the worst possible choice.

  • When Postgres thinks there is a medium amount of rows (maybe 100—10K), it will prefer the path__overlap GIN index. If only a filter on path__last was set in the query, then it will not use either of the indices and will instead do a Seq Scan, the worst possible choice.

  • When Postgres thinks there is a high amount of rows, it will always do a Seq Scan.

It looks like specifying both filters makes Postgres use just one of the two indices, and then re-check that the path matches the other condition. Some quick performance tests show this is the best option.

Merge request reports

Loading