Cache ListCorpusWorkerVersions
Goal: cache the API results of `ListCorpusWorkerVersions` using Django's low-level cache, creating a generic mixin that other API list views can reuse.
The mixin would override parts of `ListAPIView` and expose a method to build a cache key for a given request.
The API view using that mixin would be able to define its own logic, and get out of the box:
- a check on the cache as early as possible in the HTTP GET request flow
- if no cached version is available, a call to the generic workflow, with the result saved in the cache for the next call
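The flow above can be sketched roughly as follows. This is a minimal illustration, not the final implementation: `CachedListMixin`, `build_cache_key`, `GenericListView` and `DictCache` are hypothetical names, the request is a plain dict, and `DictCache` is an in-memory stand-in that only mimics the `get`/`set` calls of Django's low-level cache so the snippet runs without a Django project.

```python
class DictCache:
    """In-memory stand-in for the get/set subset of Django's low-level cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # The timeout is ignored here; a real backend would expire the entry.
        self._data[key] = value


# In a Django project this would be: from django.core.cache import cache
cache = DictCache()


class CachedListMixin:
    """Hypothetical mixin: serve from the cache, else run the generic list."""

    def build_cache_key(self, request):
        # Each view decides how a request maps to a cache key.
        raise NotImplementedError

    def list(self, request):
        key = self.build_cache_key(request)
        cached = cache.get(key)
        if cached is not None:
            return cached  # cache hit: skip the generic workflow entirely
        result = super().list(request)  # cache miss: run the generic workflow
        cache.set(key, result)  # save the result for the next call
        return result


class GenericListView:
    """Stand-in for the generic list view; counts how often it really runs."""

    calls = 0

    def list(self, request):
        GenericListView.calls += 1
        return [f"version-{n}" for n in range(3)]


class ListCorpusWorkerVersions(CachedListMixin, GenericListView):
    def build_cache_key(self, request):
        return f"ListCorpusWorkerVersions:{request['corpus_id']}:page_{request['page']}"


view = ListCorpusWorkerVersions()
request = {"corpus_id": "abc", "page": 2}
first = view.list(request)   # miss: computed, then stored
second = view.list(request)  # hit: served from the cache
```

With a real `ListAPIView`, `list()` returns a response object, so the mixin would probably cache the serialized data rather than the response itself; that detail is left out of this sketch.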
The main issue lies in building smart cache keys:
- they need to be easy to find so they can be deleted (cache invalidation from other parts of the code)
- they need to be easy & fast to build in the HTTP GET request flow
- they need to be precise, and reflect the request attributes (pagination especially)
As Django's cache API does not provide a way to list cache keys matching a pattern, we'll need to maintain a list of keys for each related object (cache busting DB approach). With that in place, creating cache keys becomes easy:
- generate a unique ID for a given request (serialize all request parameters, using url query string, path, ....)
- link that to an object in another cache key
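One possible way to generate such a unique ID is to hash a canonical form of the request. The sketch below assumes the path and the raw query string are available as strings; `build_cache_key` and its parameters are hypothetical names, and hashing is just one option next to a readable key such as `page_2`.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode


def build_cache_key(view_name, path, query_string):
    """Build a deterministic cache key from the request path and query string."""
    # Sort the parameters so that ?a=1&b=2 and ?b=2&a=1 share one key.
    params = sorted(parse_qsl(query_string, keep_blank_values=True))
    canonical = f"{path}?{urlencode(params)}"
    # A short digest keeps the key compact whatever the parameters contain.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{view_name}:{digest}"


key_a = build_cache_key("ListCorpusWorkerVersions", "/api/v1/workers/abc/versions/", "page=2&size=10")
key_b = build_cache_key("ListCorpusWorkerVersions", "/api/v1/workers/abc/versions/", "size=10&page=2")
key_c = build_cache_key("ListCorpusWorkerVersions", "/api/v1/workers/abc/versions/", "page=3&size=10")
```

Here `key_a` and `key_b` are identical because the parameters are sorted before hashing, while `key_c` differs because the page differs.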
Maintaining that list does not change anything for the cache check and only adds 2 steps to the cache build:
- retrieve the current cache reference for the target
- update that cache reference
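The two extra steps could look roughly like this. `save_with_reference` and `target_cache_key` are hypothetical helpers, the corpus ID `abc` is made up, and `DictCache` is again an in-memory stand-in for the `get`/`set` calls of Django's low-level cache.

```python
class DictCache:
    """In-memory stand-in for the get/set subset of Django's low-level cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


cache = DictCache()  # in a Django project: from django.core.cache import cache


def target_cache_key(target_type, target_id):
    """Key under which the list of dependent cache keys is stored."""
    return f"cache:{target_type}:{target_id}"


def save_with_reference(key, value, target_type, target_id):
    """Store a result and record its key in the target's reference list."""
    cache.set(key, value)
    ref_key = target_cache_key(target_type, target_id)
    # Step 1: retrieve the current cache reference for the target.
    keys = list(cache.get(ref_key) or [])
    # Step 2: update that reference, without adding duplicates.
    if key not in keys:
        keys.append(key)
    cache.set(ref_key, keys)


save_with_reference("ListCorpusWorkerVersions:abc:page_1", ["v1", "v2"], "corpus", "abc")
save_with_reference("ListCorpusWorkerVersions:abc:page_2", ["v3"], "corpus", "abc")
save_with_reference("ListCorpusWorkerVersions:abc:page_2", ["v3"], "corpus", "abc")
```

Note that this read-then-write update is not atomic: two concurrent requests could each read the old list and one registration could be lost, so a real implementation may need a lock or a backend-specific atomic operation.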
Example for `/api/v1/workers/{corpus_id}/versions/?page=2`:
- cache key is `ListCorpusWorkerVersions:{corpus_id}:page_2` (or something close)
- target element is corpus `corpus_id`
- target cache key is `cache:corpus:{corpus_id}`
- ... and it holds a pickled list including `ListCorpusWorkerVersions:{corpus_id}:page_2`
When we want to bust the whole cache for that corpus:
- retrieve the list of keys held under the corpus's target cache key
- call `delete_many` on all of those keys
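A minimal sketch of that invalidation, under the same assumptions as before: `bust_target_cache` is a hypothetical helper, the corpus IDs are made up, and `DictCache` is an in-memory stand-in for the `get`, `set` and `delete_many` calls of Django's low-level cache.

```python
class DictCache:
    """In-memory stand-in for get/set/delete_many of Django's low-level cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value

    def delete_many(self, keys):
        for key in keys:
            self._data.pop(key, None)


cache = DictCache()  # in a Django project: from django.core.cache import cache


def bust_target_cache(target_type, target_id):
    """Delete every cached entry registered for a target; return how many."""
    ref_key = f"cache:{target_type}:{target_id}"
    keys = list(cache.get(ref_key) or [])
    # Remove the cached results and the reference list itself in one call.
    cache.delete_many(keys + [ref_key])
    return len(keys)


# Two cached pages registered for corpus "abc", one for another corpus.
cache.set("ListCorpusWorkerVersions:abc:page_1", ["v1", "v2"])
cache.set("ListCorpusWorkerVersions:abc:page_2", ["v3"])
cache.set("cache:corpus:abc", [
    "ListCorpusWorkerVersions:abc:page_1",
    "ListCorpusWorkerVersions:abc:page_2",
])
cache.set("ListCorpusWorkerVersions:xyz:page_1", ["v9"])

removed = bust_target_cache("corpus", "abc")
```

Entries for other corpora are left untouched, since only the keys listed under `cache:corpus:abc` are deleted.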