Generic Pylaia worker

While the training worker is being implemented in #8 (closed), we can already sketch out the Pylaia generic worker.

This worker's user configuration parameters will at least include:

use_language_model, bool, defaults to False, whether a language model is required (the worker fails if one is requested but cannot be found)
from_right_to_left, bool, defaults to False, whether to use right-to-left orientation
extraction_mode, enum (see the current parameter)
batch_size, int, defaults to 2
line_element_type, str
line_worker_version_id, str
scale_x, float
scale_y_top, float
scale_y_bottom, float
color_mode (formerly image_convert), enum
lm_weight, float
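
The parameter list above could be captured in a small typed container. This is only a sketch: the class name PylaiaUserConfig and the from_dict helper are hypothetical, and every default other than the three stated above (False, False, 2) is an assumed None placeholder, since the issue does not specify them.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class PylaiaUserConfig:
    """Hypothetical container for the user configuration listed above."""

    # Defaults stated in the issue.
    use_language_model: bool = False
    from_right_to_left: bool = False
    batch_size: int = 2
    # No default is given in the issue; None is an assumed placeholder.
    extraction_mode: Optional[str] = None
    line_element_type: Optional[str] = None
    line_worker_version_id: Optional[str] = None
    scale_x: Optional[float] = None
    scale_y_top: Optional[float] = None
    scale_y_bottom: Optional[float] = None
    color_mode: Optional[str] = None
    lm_weight: Optional[float] = None

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "PylaiaUserConfig":
        """Build the configuration, rejecting keys that are not listed above."""
        known = {field.name for field in fields(cls)}
        unknown = sorted(set(raw) - known)
        if unknown:
            raise ValueError(f"Unknown configuration keys: {unknown}")
        return cls(**raw)
```

Rejecting unknown keys is a design choice made here so that a typo in a parameter name fails early; the real worker may prefer to ignore them.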

The current models have to be ported to the right format via the CLI, as an archive containing:

model
syms.txt
weights.ckpt
language_model.arpa.gz (optional)
lexicon.txt (optional)
tokens.txt (optional)
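
A quick sanity check of an extracted archive could look like the sketch below. It assumes that the three files not marked as optional are mandatory, and the function name check_model_folder is hypothetical.

```python
from pathlib import Path
from typing import Dict, Union

# Assumption: entries not marked "(optional)" above are mandatory.
REQUIRED_FILES = ("model", "syms.txt", "weights.ckpt")
OPTIONAL_FILES = ("language_model.arpa.gz", "lexicon.txt", "tokens.txt")


def check_model_folder(folder: Union[str, Path]) -> Dict[str, bool]:
    """Fail if a required file is missing, then report which optional files exist."""
    folder = Path(folder)
    missing = [name for name in REQUIRED_FILES if not (folder / name).exists()]
    if missing:
        raise FileNotFoundError(f"Missing required files in {folder}: {missing}")
    return {name: (folder / name).exists() for name in OPTIONAL_FILES}
```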

We need to do something similar to the U-FCN generic worker:

load the model version configuration using self.model_configuration
load the model and the optional language model from the right folder
- if "model" is in config, keep the current behavior
- else, find the path to the model using self.find_model_directory() and retrieve what's available there (including the language model if self.config["use_language_model"] is True)
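
The branching described above can be sketched as a standalone function. Everything here is illustrative: resolve_model_paths is a hypothetical name, find_model_directory is passed in as a plain callable standing in for self.find_model_directory(), returning None stands in for "keep the current behavior", and requiring all three language model files when use_language_model is set is an assumption.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Mapping, Optional, Union

MODEL_FILES = ("model", "syms.txt", "weights.ckpt")
# Assumption: all three files are needed when a language model is requested.
LANGUAGE_MODEL_FILES = ("language_model.arpa.gz", "lexicon.txt", "tokens.txt")


def resolve_model_paths(
    config: Mapping[str, Any],
    find_model_directory: Callable[[], Union[str, Path]],
) -> Optional[Dict[str, Path]]:
    """Return the files to load, or None when the legacy "model" key is set."""
    if "model" in config:
        # Legacy path: the caller keeps its current behavior.
        return None

    model_dir = Path(find_model_directory())
    paths = {name: model_dir / name for name in MODEL_FILES}

    if config.get("use_language_model", False):
        for name in LANGUAGE_MODEL_FILES:
            path = model_dir / name
            if not path.exists():
                # Fail loudly: a language model was requested but not found.
                raise FileNotFoundError(f"Language model file not found: {path}")
            paths[name] = path

    return paths
```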

Just like the U-FCN generic worker, we need a new CI job that publishes the model on https://arkindex.teklia.com. In this case there is no need to rename the folder, but we need the --use-parent-folder option to publish the whole folder instead of only the model binary file.

Edited Oct 17, 2022 by Yoann Schneider
