diff --git a/docs/contents/implem/configure.md b/docs/contents/implem/configure.md
new file mode 100644
index 0000000000000000000000000000000000000000..2064ec2c2e78654e1c5633d1ad439c16656c6ec0
--- /dev/null
+++ b/docs/contents/implem/configure.md
@@ -0,0 +1,48 @@
+# Configuration
+
+When the worker is running over elements, be it locally or on Arkindex, the first step before actually doing
+anything is configuration. This process is implemented in the `configure` method that a worker
+inherits from [ElementsWorker][elements_worker] and [BaseWorker][base_worker].
+This method can also be overridden if the worker needs additional configuration steps.
+
+The developer mode was designed to help worker developers reproduce and test how their worker
+would behave on Arkindex. This is why the configuration process in this mode mirrors the one on Arkindex while
+replacing API calls with CLI arguments.
+
+The developer mode is enabled when at least one of the following three conditions is met:
+- the `--dev` CLI argument is used,
+- the `WORKER_VERSION_ID` variable is not set in the environment,
+- the `ARKINDEX_WORKER_RUN` variable is not set in the environment.
+
+None of these conditions is met when running on Arkindex.
+
+
+## Developer mode
+- The worker's configuration YAML, which holds the variables needed by the worker,
+also contains the list of secrets needed by the worker. See [secrets][].
+
+- `ARKINDEX_CORPUS_ID` to specify which corpus the processed elements belong to
+- Local secrets loading
+
+- DEBUG mode
+When implementing a new worker, some additional logs might be needed to properly investigate
+why something is not working as intended. The logging level can be set to `DEBUG` in any of the following ways:
+- using the `--verbose` CLI argument,
+- setting `ARKINDEX_DEBUG` to `True` in the environment,
+- specifying `"debug": True` in the worker's configuration via the `user_configuration`.
+  For more information, see [how to use the user_configuration][user-config].
+
+
+## Arkindex mode
+
+- DEBUG mode
+- RetrieveWorkerRun: what is a worker run? Link to the Arkindex API? What information does it give?
+- user_configuration loading + reading default values and storing them in the config
+- actual secrets loading
+- overriding the config with the worker's configuration
+
+
+[elements_worker]: /../../../ref/elements_worker#elements-worker
+[base_worker]: /../../ref/base_worker#base-worker
+[user-config]: /../workers/yaml.md#setting-up-user-configurable-parameters
diff --git a/docs/contents/implem/index.md b/docs/contents/implem/index.md
new file mode 100644
index 0000000000000000000000000000000000000000..5a195bb259c428c2d5d2a10965c26dd24696274e
--- /dev/null
+++ b/docs/contents/implem/index.md
@@ -0,0 +1,2 @@
+# Worker Implementation
+
diff --git a/mkdocs.yml b/mkdocs.yml
index 7c220ff11a7dde6087752584c193693c8e7ce68a..45b2e97d5f77325bd565b860bf30680063c54e8e 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -70,6 +70,9 @@ nav:
   - Using secrets in workers:
     - contents/secrets/index.md
     - Usage: contents/secrets/usage.md
+  - Worker Implementation:
+    - contents/implem/index.md
+    - Configuration: contents/implem/configure.md
   - Python Reference:
     - Base Worker: ref/base_worker.md
     - Elements Worker: ref/elements_worker.md