Skip to content

Toggle database optimisations using settings

Someone decided to schedule a client demo on a Monday morning, right after a weekend of running the long migrations from !1983 (merged) in demo and prod. Nothing could possibly go wrong. #1521 (closed) surely would not happen.

The strange choice of the query planner to use a different index than usual occurred because a VACUUM ANALYZE was run after the migrations. An ANALYZE or a VACUUM ANALYZE were required to apply !1998 (merged), and both can cause the statistics used by the query planner to change. The updated stats caused the query planner to think the path__last index to be slower, when it still is the fastest option. This behavior was only seen in demo and on my local dev setup, but not in preprod or in prod, where the path__last index was still correctly used.

The fix in !2014 (merged) forces the demo to use the GIN index, and it then combines it with the path__last index. We already use this optimisation in other places, like in Element.get_next_order, because the GIN index ensures Postgres does not just use one of the B-Tree indices like it did in #1521 (closed). But this can make some queries up to 30% slower in prod and preprod!

Since those optimisations are all dependent on the planner statistics, which depend on the data analyzed by Postgres on each instance, we need to have a way to toggle the optimisations per instance. The settings could look like this:

database:
  optimizations:
    always_use_elementpath_gin: on

When this setting is enabled, everywhere where we use a Q(path__last=element_id), we should also use a Q(path__overlap=[element_id]) or a Q(path__contains=[element_id]), since both will cause Postgres to use the GIN index. When the setting is disabled, the extra filter is not added and we only use path__overlap and path__contains when we are looking for elements recursively.

Edited by Erwan Rouchet