Sort ListCorpusWorkerVersions by worker name and revision hash
Closes #1392 (closed)
While working on #1324 (closed), I found that cursor-based pagination requires ordering fields that live directly on the model; it cannot support fields that are non-unique and on related tables. I experimented for a while and confirmed that this requirement is not arbitrary: cursor pagination works with a `WHERE field > value` clause, so comparing worker names and revision hashes would not work well here. It can only work with something like `id` or `created`. Alternatives to cursor pagination all rely on calling `.count()`, which is unavoidable, so it is pointless to implement any custom pagination here.
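To illustrate why a `WHERE field > value` cursor needs a unique field, here is a minimal sketch in plain Python. It is a simplified model of keyset pagination, not the actual `CustomCursorPagination` or DRF code; the row data and the helper names `keyset_page` and `paginate_all` are hypothetical.

```python
# Hypothetical rows: (id, worker_name). Worker names are not unique.
ROWS = [
    (1, "classifier"),
    (2, "classifier"),
    (3, "classifier"),
    (4, "ner"),
    (5, "ocr"),
]


def keyset_page(rows, key, page_size, cursor=None):
    """Emulate `WHERE key > cursor ORDER BY key LIMIT page_size`."""
    ordered = sorted(rows, key=key)
    if cursor is not None:
        ordered = [row for row in ordered if key(row) > cursor]
    return ordered[:page_size]


def paginate_all(rows, key, page_size):
    """Walk every page, using the last row's key as the next cursor."""
    seen, cursor = [], None
    while True:
        page = keyset_page(rows, key, page_size, cursor)
        if not page:
            return seen
        seen.extend(page)
        cursor = key(page[-1])


# Unique key: every row is returned exactly once.
by_id = paginate_all(ROWS, key=lambda row: row[0], page_size=2)

# Non-unique key: the third "classifier" row shares the cursor value
# of the first page's last row, so `name > cursor` skips it.
by_name = paginate_all(ROWS, key=lambda row: row[1], page_size=2)
```

In this model, paginating by `id` returns all five rows, while paginating by name silently drops the row with id 3.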
I ran a whole series of tests in which I assigned ~14K worker versions to a corpus (`corpus.worker_versions.set(WorkerVersion.objects.all())`) to stress the pagination and the select/prefetches as much as possible. I played with the page size settings, even tried disabling pagination entirely, and found barely any difference with and without sorting. The main issues addressed in #703 (closed), when the `CustomCursorPagination` was added to this endpoint, were caused by the prod database being under temporary heavy load. The main change there was to order by ID specifically to optimize everything, and there really isn't much of an option: either we handle the DB being on fire, or we sort nicely.
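For comparison, the "sort nicely" side of this trade-off can be sketched as count-based page-number pagination. This is only an illustration in plain Python, not the endpoint's code; the sample data and the helper `page_number_paginate` are hypothetical, and `len()` stands in for the queryset `.count()` call.

```python
import math

# Hypothetical worker versions as (worker_name, revision_hash) pairs.
VERSIONS = [
    ("ocr", "b2f1"),
    ("classifier", "9c0d"),
    ("ner", "77aa"),
    ("classifier", "1a3e"),
    ("ocr", "04bc"),
]


def page_number_paginate(items, page, page_size):
    """Return (total_pages, rows) for a 1-indexed page."""
    # Tuples compare element by element: worker name first, then revision hash.
    ordered = sorted(items)
    # The total is needed to know how many pages exist.
    count = len(ordered)
    total_pages = math.ceil(count / page_size)
    start = (page - 1) * page_size
    return total_pages, ordered[start:start + page_size]


total_pages, first_page = page_number_paginate(VERSIONS, page=1, page_size=2)
```

Because pages are addressed by offset instead of by a cursor value, duplicate worker names are not a problem here; the cost is the extra count.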
Because this removes the `CustomCursorPagination` entirely, it does not depend on !1867 (merged).