Implement data generation

The worker should work on a corpus (taken from the process configuration) and find its latest available corpus export
Iterate over all filtered elements from the process
Extract all information about each element and ALL its children (transcriptions, metadata, entities)
Store it in a SQLite cache database compatible with our existing workers
Download all images used by these elements
Build a tar+zstd archive with the resulting payload
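
As a starting point for discussion, here is a minimal sketch of the "extract an element and all its children, then store it in a SQLite cache" steps. Everything in it is an assumption made for illustration: the table layout is a simplified, hypothetical schema (NOT the cache schema of our existing workers, which must be reused as-is), and the dict keys (`id`, `type`, `image_url`, `transcriptions`, `children`) as well as the helpers `store_element`, `build_cache` and `image_urls` are invented names. Metadata and entities would follow the same pattern as transcriptions and are left out to keep it short.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
# The real cache must match the schema used by the existing workers.
SCHEMA = """
CREATE TABLE elements (
    id TEXT PRIMARY KEY,
    parent_id TEXT REFERENCES elements(id),
    type TEXT NOT NULL,
    image_url TEXT
);
CREATE TABLE transcriptions (
    id TEXT PRIMARY KEY,
    element_id TEXT NOT NULL REFERENCES elements(id),
    text TEXT NOT NULL
);
"""


def store_element(db, element, parent_id=None):
    """Insert one element and its transcriptions, then recurse into children."""
    db.execute(
        "INSERT INTO elements (id, parent_id, type, image_url) VALUES (?, ?, ?, ?)",
        (element["id"], parent_id, element["type"], element.get("image_url")),
    )
    db.executemany(
        "INSERT INTO transcriptions (id, element_id, text) VALUES (?, ?, ?)",
        [
            (tr["id"], element["id"], tr["text"])
            for tr in element.get("transcriptions", [])
        ],
    )
    # Children are stored with a link back to the element being processed.
    for child in element.get("children", []):
        store_element(db, child, parent_id=element["id"])


def build_cache(path, elements):
    """Create the cache database at `path` and fill it from nested dicts."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    for element in elements:
        store_element(db, element)
    db.commit()
    return db


def image_urls(db):
    """Return the distinct image URLs to download, in a stable order."""
    rows = db.execute(
        "SELECT DISTINCT image_url FROM elements "
        "WHERE image_url IS NOT NULL ORDER BY image_url"
    )
    return [url for (url,) in rows]
```

Calling `build_cache("cache.sqlite", elements)` with the filtered elements, then `image_urls(db)`, would give the list of images to download once, even when several elements share the same image. The archive step is deliberately not covered here, since the choice of zstd binding is still open.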