Port init elements code

Refs https://redmine.teklia.com/issues/6067

Depends #3 (closed)

The goal is to port the init elements code from tasks into a standalone worker. It cannot be an elements worker, because at this stage there is no JSON file of elements to iterate over yet.

Please keep the SQLite generation, even though we plan to remove that part later.

User configuration parameters:

  • chunks: Number of chunks to split workflow into after initialisation, int, defaults to 1,
  • use_cache: Enable SQLite database generation for worker caching, bool, defaults to False,
  • sleep: Throttle API requests by waiting for a given number of seconds, int, defaults to 0.
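
The three parameters above could be parsed along these lines. This is only a minimal sketch: `parse_user_config` and `DEFAULTS` are hypothetical names that do not come from `arkindex_worker`, and the validation rules (at least one chunk, non-negative sleep) are assumptions that this issue does not spell out.

```python
from typing import Optional

# Hypothetical defaults table, mirroring the values documented in the list above.
DEFAULTS = {"chunks": 1, "use_cache": False, "sleep": 0}


def parse_user_config(user_config: Optional[dict]) -> dict:
    """Merge user-supplied values over the defaults and check their types."""
    config = {**DEFAULTS, **(user_config or {})}

    # bool is a subclass of int in Python, so reject it explicitly for integer fields.
    for key in ("chunks", "sleep"):
        value = config[key]
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an integer, got {value!r}")

    if not isinstance(config["use_cache"], bool):
        raise ValueError(f"use_cache must be a boolean, got {config['use_cache']!r}")

    # Assumed constraints: at least one chunk, and no negative sleep duration.
    if config["chunks"] < 1:
        raise ValueError("chunks must be at least 1")
    if config["sleep"] < 0:
        raise ValueError("sleep must not be negative")

    return config
```

Calling `parse_user_config(None)` yields the documented defaults, so `configure()` could rely on every key being present before reading `sleep`.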

Specification

from arkindex_worker.worker.base import BaseWorker
from arkindex_worker.cache import init_cache_db, create_version_table, create_tables
from pathlib import Path


class InitElements(BaseWorker):

    def configure(self):
        super().configure()

        # Parse user configuration

        # Use sleep value
        self.api_client.sleep_duration = self.config["sleep"]

    def init_db(self, db_path: Path):
        init_cache_db(db_path)
        create_version_table()
        create_tables()

    def fill_db(self):
        # Use Peewee methods instead of raw SQL
        ...

    def make_chunks(self):
        # This should always return a list of lists of dicts
        # Return early if there is only one chunk
        ...

    def run(self):
        # Call ListProcessElements (no helper), cast to arkindex_worker.models.Element

        # Remove duplicates

        # make_chunks

        # Iterate over chunks; for each chunk:
        #   dump JSON
        #   dump SQLite (init_db + fill_db)
        ...
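
The "remove dupes" and `make_chunks` steps of the skeleton above might look roughly like the following. This is a sketch under stated assumptions, not the existing implementation: elements are treated as plain dicts with an `id` key, the standalone function signatures are hypothetical (the spec defines `make_chunks` as a method), and the choice to spread the remainder over the first chunks is one possible policy among others.

```python
def remove_duplicates(elements: list) -> list:
    """Keep the first occurrence of each element ID, preserving order."""
    seen = set()
    unique = []
    for element in elements:
        if element["id"] not in seen:
            seen.add(element["id"])
            unique.append(element)
    return unique


def make_chunks(elements: list, chunks_number: int) -> list:
    """Split elements into chunks_number contiguous lists of near-equal size.

    Always returns a list of lists of dicts, even for a single chunk.
    """
    # Early return: a single chunk holds every element.
    if chunks_number <= 1:
        return [elements]

    # The first `remainder` chunks each receive one extra element.
    size, remainder = divmod(len(elements), chunks_number)
    chunks = []
    start = 0
    for index in range(chunks_number):
        end = start + size + (1 if index < remainder else 0)
        chunks.append(elements[start:end])
        start = end
    return chunks
```

With this policy, asking for more chunks than there are elements produces some empty chunks; whether those should be dropped or kept is left open here.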
Edited by Yoann Schneider