diff --git a/content/secrets/workers.md b/content/secrets/workers.md index e719c2bf469eff0d4bfb9d7c824507bb5ecd65a7..e659585d86ca3e45979b818406f5ad823cf7c042 100644 --- a/content/secrets/workers.md +++ b/content/secrets/workers.md @@ -1,56 +1,7 @@ +++ +template = "redirect_page.html" title = "Using secrets in workers" weight = 30 +[extra] +redirect_url = "https://workers.arkindex.org/contents/secrets/usage/" +++ - -## Accessing secrets in the API - -Since Arkindex 0.14.2, an API endpoint is available to retrieve secrets called -[RetrieveSecret][api]. The endpoint cannot be accessed by regular users, but -machine learning workers and administrators can use it. - -## Declaring secrets in workers - -Declaring a secret in your worker allows our [base worker package][base-worker] -to retrieve the secret for you, and causes it to fail when the secret is -missing from the Arkindex instance. - -### To declare a secret - -1. Add the secret's name to the `secrets` section of a worker in the - `.arkindex.yml` file: - - ```yaml - --- - version: 2 - type: worker - - workers: - - slug: my_worker - name: My nice worker - docker: - build: Dockerfile - configuration: - threshold: 21.3 - - # Declare your secrets as below, only specifying their name - secrets: - - project/tool/credentials.json - ``` - -For more information on the `.arkindex.yml` file, -see [YAML configuration](@/workers/yaml.md). - -## Accessing secrets in Python code - -Declared secrets will be made available to `Worker` classes as the -`self.secrets` attribute, a Python `dict` mapping secret names to -unencrypted secret content. - -### To access a secret in Python code - -1. Anywhere in the code, use `self.secrets["my_secret_name"]`, - where `my_secret_name` is the name of the secret. 
- -[base-worker]: http://pypi.org/pypi/arkindex-base-worker -[api]: https://arkindex.teklia.com/api-docs/#operation/RetrieveSecret diff --git a/content/workers/_index.md b/content/workers/_index.md index 5b7db16152be1a1321e6444aedef929d85a991ec..3966a8064e03305caebaff46763867d478dec638 100644 --- a/content/workers/_index.md +++ b/content/workers/_index.md @@ -1,32 +1,15 @@ +++ +# Redirect to the Base-Worker documentation. +# We should be able to use the built-in `redirect_to` option, +# but it does not allow external URLs so we use a custom template. +# https://github.com/getzola/zola/issues/1844 +template = "redirect_section.html" + title = "Workers" sort_by = "weight" weight = 40 insert_anchor_links = "right" -+++ - -Arkindex has a powerful system to run asynchronous tasks. Those are based on -Docker images, and can do about anything (ML processing, but also to import -data into Arkindex, export from Arkindex to another system or file format...) - -This section consists of the following guides: -* [Setting up a new worker](@/workers/create.md) -<!-- -TODO: * [Worker implementation guidelines](@/workers/implement.md) -https://gitlab.com/teklia/arkindex/doc/-/issues/17 ---> -* [Running your worker locally](@/workers/run-local.md) -<!-- -TODO: * [Running your worker on Arkindex](@/workers/run-arkindex.md) -https://gitlab.com/teklia/arkindex/doc/-/issues/18 -TODO: * [Writing tests for your worker](@/workers/tests.md) -https://gitlab.com/teklia/arkindex/doc/-/issues/16 ---> -* [Maintaining a worker](@/workers/maintenance.md) - -This section consists of the following references: - -* [GitLab Continuous Integration for workers](@/workers/ci/index.md) -* [YAML configuration](@/workers/yaml.md) -* [Worker template structure](@/workers/template-structure.md) +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/" ++++ diff --git a/content/workers/ci/index.md b/content/workers/ci/index.md index 
988659ecf58548541f40360c9e73a2b6d75e5fa6..c25a4d6ef8fc8bd0bf460d5e6f3cc379a09f99f4 100644 --- a/content/workers/ci/index.md +++ b/content/workers/ci/index.md @@ -1,101 +1,7 @@ +++ +template = "redirect_page.html" title = "GitLab Continuous Integration for workers" weight = 70 +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/ci/" +++ - -This page describes how continuous integration (CI) is used in workers created -using the `base-worker` template. - -For more information on creating workers, see -[Setting up a worker](@/workers/create.md). - -## Default template - -When creating a worker with our official template, a `.gitlab-ci.yml` file has -been included with a few actions that will run on every push you make. - -The CI jobs will run in the following order: - -<figure> - -  - - <figcaption>CI pipeline execution order</figcaption> -</figure> - -## Git Flow - -At Teklia, we use a simple version of [Git Flow][gitflow]: - -* The `master` branch should always have validated code and should be deployable - in production at any time. -* Developments should happen in branches, with merge requests to enable code - review and Gitlab CI pipelines. -* Project maintainers should use Git tags to create official releases, by - updating the `VERSION` file and using the same version string as the tag name. - -This process is reflected the template's `.gitlab-ci.yml` file. - -## Linting - -The `lint` job uses [pre-commit][pre-commit] to run source code linters on your -project and validate various rules: - -* Checking your Python code is PEP8 compliant -* Auto-formatting your Python code using [black][black] -* Sort your Python imports -* Check you don't have any trailing white space -* Check your YAML files are well formatted -* Fix some common spelling errors - -You can set up pre-commit to run locally too; see -[Activating the pre-commit hook](@/workers/create.md#activating-the-pre-commit-hook). 
- -## Testing - -The `test` job uses [tox][tox] and [pytest][pytest] modules to run written unit -tests for your repository and avoid any kind of code regression. - -Any unit test you have added to your project will be executed on each git push, -allowing you to check the validity of your code before merging it. - -Unit tests allow you to prevent regressions in your code when making changes, -and find bugs before they make their way into production. - -<!-- TODO: -For more information, see [Writing unit tests for your worker](@/workers/tests.md). -https://gitlab.com/teklia/arkindex/doc/-/issues/16 ---> - -## Building - -When the `test` & `lint` jobs run successfully, the `docker` job runs. It will -try to build a docker image from your `Dockerfile`. This will check that your -`Dockerfile` is valid and builds an image successfully. - -This build step is only used as a check, as Arkindex builds Docker images on -its own. - -## Generating release notes - -When the `docker` job is successful and the CI pipeline is running for a Git -tag, the `release-notes` job runs. It will list all the commits since the -previous tag and aggregate them to publish release notes on the GitLab project. - -We provide an [open source docker image](gitlab.com/teklia/devops/) to build these release notes, -but you'll need to provide your own Gitlab access token so that the task can -publish release notes on your own repository. - -You can generate an access token on the Gitlab's page [User Settings > Access Tokens](https://gitlab.com/-/profile/personal_access_tokens), with `api` scope. - -The token must then be set as a CI Variable on your Gitlab project: -1. go to your project settings, -2. go to section **CI / CD** -3. click on `Expand` in the **Variables** section -4. 
add a new variable named `DEVOPS_GITLAB_TOKEN` whose value is your token - -[gitflow]: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow -[pre-commit]: https://pre-commit.com/ -[black]: https://github.com/psf/black -[tox]: https://tox.readthedocs.io/ -[pytest]: https://docs.pytest.org/ diff --git a/content/workers/create.md b/content/workers/create.md index 30363604934aedd41c1a4cea1058465fd618d481..4d04c57d4a361363a9b5e863490efa5fd05a56b1 100644 --- a/content/workers/create.md +++ b/content/workers/create.md @@ -1,231 +1,8 @@ +++ +template = "redirect_page.html" title = "Create your own worker" weight = 10 -+++ - -This page will guide you through creating a new Arkindex worker locally and -preparing a development environment. - -This guide assumes you are using Ubuntu 18.04 or later and have root access. - -## Preparing your environment - -This section will guide you through preparing your system to create a new -Arkindex worker from our [official template][base-worker]. - -### Installing system dependencies - -To retrieve the Arkindex worker template, you will need to have both Git and -SSH. Git is a version control system that you will later use to manage multiple -versions of your worker. SSH allows secure connections to remote machines, and -will be used in our case to retrieve the template from a Git server. - -#### To install system dependencies - -1. Run the following command: - - ``` - sudo apt install git ssh - ``` - -### Checking your version of Python - -Our Arkindex worker template requires Python 3.6 or later. Checking if a -compatible version of Python is installed avoids further issues in the setup -process. - -#### To check your version of Python - -1. 
Run the following command: `python3 --version` - -This command will have an output similar to the following: - -``` -Python 3.6.9 -``` - -### Installing Python - -If you were unable to check your Python version as stated above because -`python3` was not found, you will need to install Python 3 on your system. - -#### To install Python on Ubuntu - -1. Run the following command: - - ``` - sudo apt install python3 python3-pip python3-virtualenv - ``` - -2. Check your Python version again, as instructed in the previous section. - -### Installing Python dependencies - -To bootstrap a new Arkindex worker, some Python dependencies will be required: - -* [pre-commit][pre-commit] will be used to automatically check the - syntax of your source code. -* [tox][tox] will be used to run unit tests. -<!-- -TODO: Link to [unit tests](@/workers/tests.md) -https://gitlab.com/teklia/arkindex/doc/-/issues/16 ---> -* [cookiecutter][cookiecutter] will be used to bootstrap the project. -* [virtualenvwrapper][virtualenvwrapper] will be used to manage Python virtual - environments. - -#### To install Python dependencies - -1. Run the following command: - - ``` - pip3 install pre-commit tox cookiecutter virtualenvwrapper - ``` - -2. Follow the - [official virtualenvwrapper setup instructions][virtualenvwrapper-setup] - until you are able to run `workon`. - -`workon` should have an empty output, as no Python virtual environments have -been set up yet. - -## Creating the project - -This section will guide you through creating a new worker from our official -template and making it available on a GitLab instance. - -### Creating a GitLab project - -For a worker to be accessible from an Arkindex instance, it needs to be sent -to a repository on a GitLab project. A GitLab project will also allow you to -manage different versions of a worker and run -[automated checks](@/workers/ci/index.md) on your code. - -#### To create a GitLab project - -1. 
Open the **New project** form [on GitLab.com](https://gitlab.com/projects/new) - or on another GitLab instance - -2. Enter your worker name as the **Project name** - -3. Define a **Project slug** related to your worker, e.g.: - - * `tesseract` for a Tesseract worker - * `opencv-foo` for an OpenCV worker related to project Foo - -4. Click on the **Create project** button -### Bootstrapping the project - -This section guides you through using our [official template][base-worker] -to get a basic structure for your worker. - -#### To bootstrap the project - -1. Open a terminal and go to a folder in which you will want your worker to be. -2. Enter this command and fill in the required information: - - ``` - cookiecutter git@gitlab.com:teklia/workers/base-worker.git - ``` - -Cookiecutter will ask you for several options: - -- `slug`: A slug for the worker. This should use lowercase alphanumeric characters or - underscores to meet the code formatting requirements that the template - automatically enforces via [black][black]. - -- `worker_type`: An arbitrary string purely used for display purposes. - For example: - - - `recognizer`, - - - `classifier`, - - - `dla`, - - - `entity-recognizer`, etc. - -- `author`: A name for the worker's author. Usually your first and last name. - -- `email`: Your e-mail address. This will be used to contact you if any administrative need arise - -### Pushing to GitLab - -This section guides you through pushing the newly created worker from your -system to the GitLab project's repository. - -This section assumes you have Maintainer or Owner access to the GitLab project. - -#### To push to GitLab - -1. Enter the newly created directory, starting in `worker-` and ending with your - worker's slug. - -2. Add your GitLab project as a Git remote: - - ``` - git remote add origin git@my-gitlab-instance.com:path/to/worker.git - ``` - - You will need to use your own instance's URL and the path to your own - project. 
For example, a project named `hello` in the `teklia` group - on `gitlab.com` will use the following command: - - ``` - git remote add origin git@gitlab.com:teklia/hello.git - ``` - -3. Push the new branch to GitLab: - - ``` - git push --set-upstream origin master - ``` - -4. Open your GitLab project in a browser. - -5. Click on the blue icon indicating that [CI](@/workers/ci/index.md) - is running on your repository, and wait for it to turn green to confirm - everything worked. - -## Setting up your development environment - -This section guides you through setting up a Python development environment -specifically for your worker. - -### Activating the pre-commit hook - -The official template includes code syntax checks such as trailing whitespace, -as well as code linting using [black][black]. Those checks run on GitLab as soon -as you push new code, but it is possible to run those automatically when you -create new commits using the [pre-commit][pre-commit] hook. - -#### To activate the pre-commit hook - -1. Run `pre-commit install`. - -### Setting up the Python virtual environment - -To install Python dependencies that are specific to your worker, and prevent -other dependencies installed on your system from interfering, it is recommended -to use a virtual environment. - -#### To set up a Python virtual environment - -1. Run `mkvirtualenv my_worker`, where `my_worker` is any name of your choice. -2. Install your worker in editable mode: `pip install -e .` - ---- - -<!-- -TODO: You can now start [implementing your worker](@/workers/implement.md). 
-https://gitlab.com/teklia/arkindex/doc/-/issues/17 ---> - -[base-worker]: https://gitlab.com/teklia/workers/base-worker/ -[pre-commit]: https://pre-commit.com/ -[tox]: https://tox.readthedocs.io/ -[cookiecutter]: https://cookiecutter.readthedocs.io/ -[virtualenvwrapper]: https://virtualenvwrapper.readthedocs.io -[virtualenvwrapper-setup]: https://virtualenvwrapper.readthedocs.io/en/latest/install.html -[black]: https://github.com/psf/black +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/create/" ++++ diff --git a/content/workers/maintenance.md b/content/workers/maintenance.md index cb1027a15e7ea91135208d77ea638a825959e9e0..8afceeda6594f858d8a93dd8acb6102ea95b0069 100644 --- a/content/workers/maintenance.md +++ b/content/workers/maintenance.md @@ -1,31 +1,7 @@ +++ +template = "redirect_page.html" title = "Maintaining a worker" weight = 60 +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/maintenance/" +++ - -This page guides you through common tasks applied while maintaining an Arkindex -worker. - -## Updating the template - -To get the changes we make on our [official template][base-worker] to apply to -your worker, you will need to re-apply the template to the worker and resolve -any conflicts that may arise. - -### To update the template - -1. Run the following command: - - ``` - cookiecutter base-worker -f --config-file YOURFILE.yaml --no-input - ``` - - Where `YOURFILE.yaml` is the path of the YAML file you previously created. - -2. Answer `yes` when Cookiecutter requests confirmation to delete and - re-download the template. - -3. Using the Git diff, resolve the conflicts yourself as Cookiecutter will be - overwriting existing files. 
- -[base-worker]: https://gitlab.com/teklia/workers/base-worker/ diff --git a/content/workers/run-local.md b/content/workers/run-local.md index 53914e0bd321dce680bcad3cb56c6ad353464dcf..b701c44233e39b789706a4380dc01f7dbe6fe4ec 100644 --- a/content/workers/run-local.md +++ b/content/workers/run-local.md @@ -1,111 +1,7 @@ +++ +template = "redirect_page.html" title = "Running your worker locally" weight = 30 +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/run-local/" +++ - -Once you have implemented a worker, you can run it on some Arkindex elements -on your own machine to test it. - -## Retrieving credentials - -For a worker to run properly, you will need two types of credentials: - -* An API token that gives the worker access to the API -* A worker version ID that lets the worker send results to Arkindex and report - that those come from this particular worker version - -### Retrieving a token - -For the worker to run, you will need an Arkindex authentication token. -You can use your own account's token when testing on your own machine. - -You can retrieve your personal API Token from your [profile page](@/users/auth/index.md#personal-token). - -### Retrieving a worker version ID - -A worker version ID will be required in order to publish results. If your worker -does not create any Arkindex element, classification, transcription, etc., you -may skip this step. - -If this particular worker was already configured on this instance, you can use -its existing worker version ID; otherwise, you will need to ask an Arkindex -administrator to create a fake version ID. - -#### To retrieve a worker version ID from an existing worker - -1. Open a web browser and browse to the Arkindex instance. -2. In the top-right user menu, click on **My repositories**. -3. Click on your worker, listed in the **Workers** column. -4. 
Rewrite the URL in your browser's address bar, to look like - `https://<arkindex_url>/api/v1/workers/<worker_id>/versions/`: - - * Replace `process` by `api/v1` - * Add a slash character (`/`) at the end - -In the JSON output from this API endpoint, the first value next to `"id"` is -the worker version ID. - -#### To create a fake worker as an administrator - -This action can only be done as an Arkindex administrator with shell access. - -1. In the backend's Docker image, run: - - ``` - arkindex fake_worker_version --name <NAME> --slug <SLUG> --url <URL> - ``` - - Replace `<NAME>`, `<SLUG>` and `<URL>` with the name, slug and GitLab - repository URL, respectively. - -A Git repository is created with a fake OAuth access token. A fake Git revision -is added to this repository, and a fake worker version from a fake worker is -linked to this revision. You should get the following output: - -``` -Created a worker version: 392bd299-bc8f-4ec6-aa3c-e6503ecc7730 -``` - -{% warning() %} -This feature should only be used when a normal worker cannot be created using the Git workflow. -{% end %} - -## Setting credentials - -In a shell you need to set 3 environment variables to transmit your credentials -and Arkindex instance information to the worker: - -- `ARKINDEX_API_URL`: URL that points to the root of the Arkindex instance you are using. -- `ARKINDEX_API_TOKEN`: The API token you retrieved earlier, on your profile page. -- `WORKER_VERSION_ID`: The worker version ID you retrieved earlier. Can be omitted if the worker does -not create new data in Arkindex. - -### To set credentials for your worker - -1. In a shell, run: - - ```sh - export ARKINDEX_API_URL="https://arkindex.teklia.com" - export ARKINDEX_API_TOKEN="YOUR_TOKEN_HERE" - export WORKER_VERSION_ID="xxxxx" - ``` - -{% warning() %} -Do not add these instructions to a script such as `.bashrc`; -this would mean storing credentials in plaintext and can lead to security -breaches. 
-{% end %} - -## Running your worker - -With the credentials configured, you can now run your worker. -You will need a list of element IDs to run your worker on, which can be found -in the browser's address bar when browsing an element on Arkindex. - -### To run your worker - -1. Activate the Python environment: run `workon X` where `X` is the name of - your Python environment. -2. Run `worker-X`, where `X` is the slug of your worker, followed by - `--element=Y` where `Y` is the ID of an element. You can repeat `--element` - as many times as you need to process multiple elements. diff --git a/content/workers/template-structure.md b/content/workers/template-structure.md index 22d83d06cc0b69a5fd89003d991b556d19b7ccac..6e9fdd5590a127027cc9b1988fbb905af1e8d3ed 100644 --- a/content/workers/template-structure.md +++ b/content/workers/template-structure.md @@ -1,157 +1,7 @@ +++ +template = "redirect_page.html" title = "Base Worker template structure" weight = 90 +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/template-structure/" +++ - -When building a new worker from our [official template][base-worker], a file -structure gets created for you to ease the burden of setting up a Python -package, a Docker build, with the best development practices: - -<dl> - <dt>.arkindex.yml</dt> - <dd> - -YAML configuration file that allows Arkindex to understand what it should do -with this repository. - -To learn more about this file, see [YAML configuration](@/workers/yaml.md). - - </dd> - <dt>.cookiecutter.yaml</dt> - <dd> - -YAML file that stores the options you defined when creating a new worker. -This file can be reused to [fetch template updates][template-updates]. - - </dd> - <dt>.dockerignore</dt> - <dd> - -Lists which files to exclude from the Docker build context. - -For more information, see the [Docker documentation][dockerignore]. - - </dd> - <dt>.flake8</dt> - <dd> - -Specifies configuration options for the Flake8 linter. 
- -For more information, see the [Flake8 documentation][flake8]. - - </dd> - <dt>.gitignore</dt> - <dd> - -Lists which files to exclude from Git versioning. - -For more information, see the [Git docs][gitignore]. - - </dd> - <dt>.gitlab-ci.yml</dt> - <dd> - -Configures the GitLab CI jobs and pipelines. - -To learn more about the configuration we provide, see -[GitLab Continuous Integration for workers](@/workers/ci/index.md). - - </dd> - <dt>.isort.cfg</dt> - <dd> - -Configures the automatic Python import sorting rules. - -For more information, see the [isort docs][isort]. - - </dd> - <dt>.pre-commit.config.yaml</dt> - <dd> - -Configures the -[pre-commit hook](@/workers/create.md#activating-the-pre-commit-hook). - - </dd> - <dt>Dockerfile</dt> - <dd> - -Specifies how the Docker image will be built. - -You can change the instructions in this file to update the image to the needs -of your worker, for example to install system dependencies. - - </dd> - <dt>requirements.txt</dt> - <dd> - -Lists the Python dependencies your worker relies on. Those are automatically -installed by the default Dockerfile. - - </dd> - <dt>tox.ini</dt> - <dd> - -Configures the Python unit test runner. - -For more information, see the [tox docs][tox]. - - </dd> - <dt>setup.py</dt> - <dd> - -Configures the worker's Python package. - - </dd> - <dt>VERSION</dt> - <dd> - -Official version number of your worker. Defaults to `0.1.0`. - - </dd> - <dt>ci/build.sh</dt> - <dd> - -Script that gets run by [CI](@/workers/ci/index.md) pipelines -to build the Docker image. - -</dd> - - <dt>tests/test_worker.py</dt> - <dd> - -An example unit test file. - -<!-- -TODO: For more information, see [Writing tests for your worker](@/workers/tests.md). -https://gitlab.com/teklia/arkindex/doc/-/issues/16 ---> - - </dd> - <dt>worker_[slug]/__init__.py</dt> - <dd> - -Declares the folder as a Python package. - - </dd> - <dt>worker_[slug]/worker.py</dt> - <dd> - -The core part of the worker. 
This is where you can write code that processes -Arkindex elements. - -<!-- TODO: -For more information, see -[Implementing a Machine Learning worker](@/workers/implement.md). -https://gitlab.com/teklia/arkindex/doc/-/issues/17 ---> - - </dd> -</dl> - -[base-worker]: https://gitlab.com/teklia/workers/base-worker/ -[template-updates]: @/workers/maintenance.md#updating-the-template -[dockerignore]: https://docs.docker.com/engine/reference/builder/#dockerignore-file -[flake8]: https://flake8.pycqa.org/en/latest/user/configuration.html -[gitignore]: https://git-scm.com/docs/gitignore -[isort]: https://pycqa.github.io/isort/docs/configuration/config_files/ -[tox]: https://tox.readthedocs.io/en/latest/config.html diff --git a/content/workers/yaml.md b/content/workers/yaml.md index 7c3e1fbb1bbb5967486c0367726947c798fedccd..96bff0d6fe8a33f20acffe8ab34abbd29cee1547 100644 --- a/content/workers/yaml.md +++ b/content/workers/yaml.md @@ -1,306 +1,7 @@ +++ +template = "redirect_page.html" title = "YAML configuration" weight = 80 +[extra] +redirect_url = "https://workers.arkindex.org/contents/workers/yaml/" +++ - -This page is a reference for version 2 of the YAML configuration file for -Git repositories handled by Arkindex. Version 1 is not supported. - -The configuration file is always named `.arkindex.yml` and should be found at -the root of the repository. - -## Required attributes - -The following attributes are required in every `.arkindex.yml` file: - -- `version`: Version of the configuration file in use. An error will occur if the version -number is not set to `2`. -- `workers`: A list of workers attached to the Git repository. 
- -The `workers` attribute is a list of the following: - -* Paths to a YAML file holding the configuration for a single worker -* Unix-style patterns matching paths to YAML files holding the configuration - for a single worker -* The configuration of a single worker embedded directly into the file - -### Single worker configuration - -The following describes the attributes of a YAML file configuring one worker, or -of the configuration embedded directly in the `.arkindex.yml` file. - -All attributes are optional unless explicitly specified. - -- `name`: Mandatory. Name of the worker, for display purposes. -- `slug`: Mandatory. Slug of this worker. The slug must be unique across the repository and must only hold alphanumerical characters, underscores or dashes. -- `type`: Mandatory. Type of the worker, for display purposes only. Some common values -include: - - * `classifier` - * `recognizer` - * `ner` - * `dla` - * `word-segmenter` - * `paragraph-creator` - -- `docker`: Regroups Docker-related configuration attributes: - - - `build`: Path towards a Dockerfile used to build this worker, relative to the root of the repository. Defaults to `Dockerfile`. - - <!-- - TODO: Make the path relative to the YAML file itself, in the case of a - separate file for a single worker? - https://gitlab.com/teklia/arkindex/tasks/-/issues/95 - --> - - <!-- - TODO: Implement this! - https://gitlab.com/teklia/arkindex/tasks/-/issues/93 - - `image`: Tag of an existing Docker image to use for this worker instead of building a - custom image from a Dockerfile. - --> - - - `command`: Custom command line to be used when launching the Docker container for this Worker. By default, the command specified in the Dockerfile will be used. - - `shm_size`: Size of the available shared memory in `/dev/shm`. The default value is `64M`, but when training machine learning models an increase might be necessary. 
The given value must be either an integer, or an integer followed by a unit (`b` for bytes, `k` for kilobytes, `m` for megabytes and `g` for gigabytes). If no unit is specified, the default unit is `bytes`. See the [Docker documentation](https://docs.docker.com/engine/reference/run/#runtime-constraints-on-resources). - -- `environment`: Mapping of string keys and string values to define environment variables to be -set when the Docker image runs. - -- `configuration`: Mapping holding any string keys and values that can be later accessed in the -worker's Python code. Can be used to define settings on your own worker, such as -a file's location. - -- `user_configuration`: Mapping defining settings on your worker that can be modified by users. [See below](#setting-up-user-configurable-parameters) for details. - -- `secrets`: List of required secret names for that specific worker. - -For more information, see [Using secrets in workers](@/secrets/workers.md). - -### Setting up user-configurable parameters - -The YAML file can define parameters that users will be able to change when they use this worker in a process on Arkindex. These parameters are listed in a `user_configuration` attribute. - -A parameter is defined using the following settings: -- `title`: mandatory. The parameter's title. -- `type`: mandatory. A value type. The supported types are: - - `int` - - `bool` - - `float` - - `string` - - `enum` - - `list` - - `dict` -- `default`: optional. A default value for the parameter. Must be of the defined parameter `type`. -- `required`: optional. A boolean, defaults to `false`. -- `choices`: optional. Required for and usable with `enum` type parameters only. -- `subtype`: optional. Required for and usable with `list` type parameters only. - -This definition allows for both validation of the input and the display of a form to make configuring workers easy for Arkindex users. 
- -{{ figure(image="workers/user_configuration/configuration_form.png", height=450, caption="User configuration form on Arkindex") }} - -#### String parameters - -String-type parameters must be defined using a `title` and the `string` `type`. You can also set a `default` value for this parameter, which must be a string, as well as make it a `required` parameter, which prevents users from leaving it blank. - -For example, a string-type parameter can be defined like this: - -```yaml -subfolder_name: - title: Created Subfolder Name - type: string - default: My Neat Subfolder -``` - -Which will result in the following display for the user: - -{{ figure(image="workers/user_configuration/string_config.png", height=300, caption="Example string-type parameter.") }} - -#### Integer parameters - -Integer-type parameters must be defined using a `title` and the `int` `type`. You can also set a `default` value for this parameter, which must be an integer, as well as make it a `required` parameter, which prevents users from leaving it blank. - -For example, an integer-type parameter can be defined like this: - -```yaml -input_size: - title: Input Size - type: int - default: 768 - required: True -``` - -Which will result in the following display for the user: - -{{ figure(image="workers/user_configuration/integer_config.png", height=300, caption="Example integer-type parameter.") }} - -#### Float parameters - -Float-type parameters must be defined using a `title` and the `float` `type`. You can also set a `default` value for this parameter, which must be a float, as well as make it a `required` parameter, which prevents users from leaving it blank. 
- -For example, a float-type parameter can be defined like this: - -```yaml -wip: - title: Word Insertion Penalty - type: float - required: True -``` - -Which will result in the following display for the user: - -{{ figure(image="workers/user_configuration/float_config.png", height=300, caption="Example float-type parameter.") }} - -#### Boolean parameters - -Boolean-type parameters must be defined using a `title` and the `bool` `type`. You can also set a `default` value for this parameter, which must be a boolean, as well as make it a `required` parameter, which prevents users from leaving it blank. - -In the configuration form, boolean parameters are displayed as toggles. - -For example, a boolean-type parameter can be defined like this: - -```yaml -score: - title: Run Worker in Evaluation Mode - type: bool - default: False -``` - -Which will result in the following display for the user: - -{{ figure(image="workers/user_configuration/bool_config.png", height=300, caption="Example boolean-type parameter.") }} - -#### Enum (choices) parameters - -Enum-type parameters must be defined using a `title`, the `enum` `type` and at least two `choices`. You cannot define an enum-type parameter without `choices`. You can also set a `default` value for this parameter, which must be one of the available `choices`, as well as make it a `required` parameter, which prevents users from leaving it blank. Enum-type parameters should be used when you want to limit the users to a given set of options. - -In the configuration form, enum parameters are displayed as selects. 
-
-For example, an enum-type parameter can be defined like this:
-
-```yaml
-parent_type:
-  title: Target Parent Element Type
-  type: enum
-  default: paragraph
-  choices:
-    - paragraph
-    - text_zone
-    - page
-```
-
-Which will result in the following display for the user:
-
-{{ figure(image="workers/user_configuration/enum_config.png", height=300, caption="Example enum-type parameter.") }}
-
-#### List parameters
-
-List-type parameters must be defined using a `title`, the `list` `type` and a `subtype` for the elements inside the list. You can also set a `default` value for this parameter, which must be a list containing elements of the given `subtype`, as well as make it a `required` parameter, which prevents users from leaving it blank.
-
-The allowed `subtype`s are `int`, `float` and `string`.
-
-In the configuration form, list parameters are displayed as rows of input fields.
-
-For example, a list-type parameter can be defined like this:
-
-```yaml
-a_list:
-  title: A List of Values
-  type: list
-  subtype: int
-  default: [4, 3, 12]
-```
-
-Which will result in the following display for the user:
-
-{{ figure(image="workers/user_configuration/list_config.png", height=360, caption="Example list-type parameter.") }}
-
-#### Dictionary parameters
-
-Dictionary-type parameters must be defined using a `title` and the `dict` `type`. You can also set a `default` value for this parameter, which must be a dictionary, as well as make it a `required` parameter, which prevents users from leaving it blank. You can use dictionary parameters for example to specify a correspondence between the classes that are predicted by a worker and the elements that are created on Arkindex from these predictions.
-
-Dictionary-type parameters only accept strings as values.
-
-In the configuration form, dictionary parameters are displayed as a table with one column for keys and one column for values.
-
-For example, a dictionary-type parameter can be defined like this:
-
-```yaml
-classes:
-  title: Output Classes to Elements Correspondence
-  type: dict
-  default:
-    a: page
-    b: text_line
-```
-
-Which will result in the following display for the user:
-
-{{ figure(image="workers/user_configuration/dict_config.png", height=300, caption="Example dictionary-type parameter.") }}
-
-#### Example user_configuration
-
-```yaml
-user_configuration:
-  vertical_padding:
-    type: int
-    default: 0
-    title: Vertical Padding
-  element_base_name:
-    type: string
-    required: true
-    title: Element Base Name
-  create_confidence_metadata:
-    type: bool
-    default: false
-    title: Create confidence metadata on elements
-  some_other_parameter:
-    type: enum
-    required: true
-    default: 23
-    choices:
-      - 12
-      - 23
-      - 56
-    title: Another Parameter
-```
-
-#### Fallback to free JSON input
-
-If you have defined user-configurable parameters using these specifications, Arkindex users can choose between using the form or the free JSON input field by toggling the **JSON** toggle. If there are unsupported parameter types in the defined `user_configuration`, the frontend will automatically fall back to the free JSON input field. The same is true if you have not defined user-configurable parameters using these specifications.
-
-### Example configuration
-
-```yaml
----
-version: 2
-workers:
-  # Path to a single YAML file
-  - path/to/worker.yml
-  # Pattern matching any YAML file in the configuration folder
-  # or in its sub-directories
-  - configuration/**/*.yml
-  # Configuration embedded directly into this file
-  - name: Book of hours
-    slug: book_of_hours
-    type: classifier
-    docker:
-      build: project/Dockerfile
-      image: hub.docker.com/project/image:tag
-      command: python mysuperscript.py --blabla
-      shm_size: 128m
-    environment:
-      TOKEN: deadBeefToken
-    configuration:
-      model: path/to/model
-      anyKey: anyValue
-      classes: [X, Y, Z]
-    user_configuration:
-      vertical_padding:
-        type: int
-        default: 0
-        title: Vertical Padding
-    secrets:
-      - path/to/secret.json
-```
diff --git a/templates/redirect_page.html b/templates/redirect_page.html
new file mode 100644
index 0000000000000000000000000000000000000000..6ba6196fa4e82fcacda8cb58afd73658b911103c
--- /dev/null
+++ b/templates/redirect_page.html
@@ -0,0 +1,13 @@
+{# Template to redirect to an external URL.
+ Copied from the original Zola redirection template:
+ https://github.com/getzola/zola/blob/96db5231e74599815df381fbf90d499c2d5b3814/components/templates/src/builtins/internal/alias.html
+ The `url` variable is computed by Zola as an internal link, always relative to the site's base URL,
+ which does not allow for external URLs.
+ This template uses a custom property in `[extra]` to instead allow redirecting anywhere. #}
+ <!doctype html>
+ <meta charset="utf-8">
+ <link rel="canonical" href="{{ page.extra.redirect_url | safe }}">
+ <meta http-equiv="refresh" content="0; url={{ page.extra.redirect_url | safe }}">
+ <title>Redirect</title>
+ <p><a href="{{ page.extra.redirect_url | safe }}">Click here</a> to be redirected.</p>
+ 
\ No newline at end of file
diff --git a/templates/redirect_section.html b/templates/redirect_section.html
new file mode 100644
index 0000000000000000000000000000000000000000..1d30172cf944fa52df60c68e1f7ff00ceb982fa9
--- /dev/null
+++ b/templates/redirect_section.html
@@ -0,0 +1,13 @@
+{# Template to redirect to an external URL.
+ Copied from the original Zola redirection template:
+ https://github.com/getzola/zola/blob/96db5231e74599815df381fbf90d499c2d5b3814/components/templates/src/builtins/internal/alias.html
+ The `url` variable is computed by Zola as an internal link, always relative to the site's base URL,
+ which does not allow for external URLs.
+ This template uses a custom property in `[extra]` to instead allow redirecting anywhere. #}
+ <!doctype html>
+ <meta charset="utf-8">
+ <link rel="canonical" href="{{ section.extra.redirect_url | safe }}">
+ <meta http-equiv="refresh" content="0; url={{ section.extra.redirect_url | safe }}">
+ <title>Redirect</title>
+ <p><a href="{{ section.extra.redirect_url | safe }}">Click here</a> to be redirected.</p>
+ 
\ No newline at end of file