Sort ListCorpusWorkerVersions by worker name and revision hash

Erwan Rouchet requested to merge sort-corpus-worker-versions into master

Closes #1392 (closed)

When working on #1324 (closed), I found out that cursor-based pagination requires ordering fields that live directly on the model; it cannot support fields that are not unique or that sit on related tables. I experimented for a while and confirmed that this requirement is not arbitrary: cursor pagination works with a WHERE field > value, so comparing worker names and revision hashes would not work well here. It can only work with something like id or created. Every alternative to cursor pagination relies on calling .count(), which is unavoidable, so implementing a custom pagination here would be pointless.
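
To illustrate the WHERE field > value limitation, here is a minimal sketch in plain Python. It is a simplified model of cursor pagination, not the actual CustomCursorPagination code; the rows, the `cursor_page` and `paginate_all` helpers and the page size are all hypothetical.

```python
# Hypothetical rows: (id, worker_name). Worker names are not unique.
ROWS = [
    (1, "classifier"),
    (2, "classifier"),
    (3, "classifier"),
    (4, "recognizer"),
]


def cursor_page(rows, key, cursor, size):
    """Emulate `WHERE key > cursor ORDER BY key LIMIT size`."""
    ordered = sorted(rows, key=key)
    if cursor is not None:
        ordered = [row for row in ordered if key(row) > cursor]
    page = ordered[:size]
    # The next cursor is the sort value of the last row on this page.
    next_cursor = key(page[-1]) if page else None
    return page, next_cursor


def paginate_all(rows, key, size):
    """Follow cursors until an empty page and collect every row seen."""
    seen, cursor = [], None
    while True:
        page, cursor = cursor_page(rows, key, cursor, size)
        if not page:
            return seen
        seen.extend(page)


# Ordering by a unique field: every row is visited exactly once.
by_id = paginate_all(ROWS, key=lambda row: row[0], size=2)

# Ordering by a non-unique field: the first page ends in the middle of the
# "classifier" group, so `name > "classifier"` skips the third classifier row.
by_name = paginate_all(ROWS, key=lambda row: row[1], size=2)
```

In this sketch, `by_id` contains all four rows, while `by_name` silently loses row 3 because its name equals the cursor value.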

I ran a whole bunch of tests in which I assigned ~14K worker versions to a corpus (corpus.worker_versions.set(WorkerVersion.objects.all())) to mess as much as possible with the pagination and select/prefetches. I played with the page size settings, even tried disabling the pagination entirely, and found barely any difference with and without sorting. #703 (closed), which added the CustomCursorPagination to this endpoint, mainly addressed various issues related to the prod database being under some temporary heavy load. The main change there was to order by ID specifically to optimize everything, and there really isn't much of an option; either we handle the DB being on fire, or we sort nicely. 🤷🏻

Because this removes the CustomCursorPagination entirely, it does not depend on !1867 (merged).
