Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

  • workers/base-worker
Showing with 652 additions and 109 deletions
doc8==0.11.1
black==22.12.0
doc8==1.0.0
mkdocs==1.4.2
mkdocs-material==8.5.11
mkdocstrings==0.19.1
mkdocstrings-python==0.8.2
recommonmark==0.7.1
Sphinx==5.1.1
sphinx-rtd-theme==1.0.0
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
docs/assets/favicon.png

2.54 KiB

docs/assets/logo.png

5.55 KiB

# -*- coding: utf-8 -*-
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
import os
import sys
from recommonmark.transform import AutoStructify
sys.path.insert(0, os.path.abspath(".."))
# -- Project information -----------------------------------------------------
project = "Arkindex Base Worker"
copyright = "2022, Teklia"
author = "Teklia"
# The full version, including alpha/beta/rc tags
with open("../VERSION") as f:
    release = f.read().strip()
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.coverage",
    "sphinx.ext.viewcode",
    "recommonmark",
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "README.md"]
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
# -- Extension configuration -------------------------------------------------
autodoc_default_options = {
    "members": True,
    "undoc-members": True,
    "member-order": "bysource",
}
def setup(app):
    app.add_config_value(
        "recommonmark_config", {"auto_toc_tree_section": "Contents"}, True
    )
    app.add_transform(AutoStructify)
# Secrets
Secrets are text payloads shared securely between the Arkindex instance and any worker. They are generally used to store sensitive values that would give Arkindex users access to resources that should stay private, for example because they cost money or are proprietary.
For more information about secrets, please visit [the Arkindex documentation](https://doc.arkindex.org/secrets/).
# Usage
## Accessing secrets in the API
Since Arkindex 0.14.2, an API endpoint called [RetrieveSecret][api] is
available to retrieve secrets. The endpoint cannot be accessed by regular
users, but machine learning workers and administrators can use it.
## Declaring secrets in workers
Declaring a secret in your worker allows our [base worker package][base-worker]
to retrieve the secret for you, and causes it to fail when the secret is
missing from the Arkindex instance.
### To declare a secret
1. Add the secret's name to the `secrets` section of a worker in the
`.arkindex.yml` file:
```yaml
---
version: 2
type: worker
workers:
  - slug: my_worker
    name: My nice worker
    docker:
      build: Dockerfile
    configuration:
      threshold: 21.3
    # Declare your secrets as below, only specifying their name
    secrets:
      - project/tool/credentials.json
```
For more information on the `.arkindex.yml` file,
see [YAML configuration](../workers/yaml.md).
## Accessing secrets in Python code
Declared secrets will be made available to `Worker` classes as the
`self.secrets` attribute, a Python `dict` mapping secret names to
unencrypted secret content.
### To access a secret in Python code
1. Anywhere in the code, use `self.secrets["my_secret_name"]`,
where `my_secret_name` is the name of the secret.
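As a minimal sketch of the access pattern described above, the following uses a stand-in class rather than the real `Worker` class. `DemoWorker`, `load_credentials` and the JSON payload are hypothetical names chosen for illustration; only the `self.secrets` dictionary lookup comes from the text above.

```python
import json


class DemoWorker:
    """Stand-in for a Worker subclass; not the real base-worker class."""

    def __init__(self, secrets):
        # In a real worker, base-worker is described as filling this
        # attribute with the declared secrets. Here it is passed in by hand.
        self.secrets = secrets

    def load_credentials(self):
        # Secret content is text, so a structured payload such as JSON
        # has to be parsed by the worker itself.
        raw = self.secrets["project/tool/credentials.json"]
        return json.loads(raw)


worker = DemoWorker({"project/tool/credentials.json": '{"user": "demo"}'})
credentials = worker.load_credentials()
```

Looking up a name that was not declared raises a `KeyError`, like any other Python `dict` lookup.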
[base-worker]: http://pypi.org/pypi/arkindex-base-worker
[api]: https://arkindex.teklia.com/api-docs/#tag/ponos/operation/RetrieveSecret
# GitLab CI for workers
This page describes how continuous integration (CI) is used in workers created
using the `base-worker` template.
For more information on creating workers, see
[Setting up a worker](../create).
## Default template
When you create a worker with our official template, a `.gitlab-ci.yml` file is
included with a few jobs that run on every push you make.
The CI jobs will run in the following order:
<img style="display:block;float:none;margin-left:auto;margin-right:auto;" src="./pipeline.svg" alt="CI pipeline execution order">
## Git Flow
At Teklia, we use a simple version of [Git Flow][gitflow]:
- The `default` branch should always have validated code and should be deployable
in production at any time.
- Development should happen in branches, with merge requests to enable code
review and GitLab CI pipelines.
- Project maintainers should use Git tags to create official releases, by
updating the `VERSION` file and using the same version string as the tag name.
This process is reflected in the template's `.gitlab-ci.yml` file.
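The release rule above (the `VERSION` file and the Git tag share the same version string) can be checked with a few lines of Python. This is only an illustrative sketch; `version_matches_tag` is a hypothetical helper and is not part of the template.

```python
def version_matches_tag(version_file_content: str, tag_name: str) -> bool:
    """Return True when the VERSION file content equals the Git tag name."""
    # Strip surrounding whitespace so that a trailing newline in VERSION
    # does not cause a false mismatch.
    return version_file_content.strip() == tag_name.strip()
```

For example, `version_matches_tag("0.1.0\n", "0.1.0")` is true, while a `VERSION` file that still holds the previous version would be reported as a mismatch.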
## Linting
The `lint` job uses [pre-commit] to run source code linters on your
project and validate various rules:
- Check that your Python code is PEP8 compliant
- Auto-format your Python code using [black]
- Sort your Python imports
- Check that you don't have any trailing white space
- Check that your YAML files are well formatted
- Fix some common spelling errors
You can set up pre-commit to run locally too; see
[Activating the pre-commit hook](../create#activating-the-pre-commit-hook).
## Testing
The `test` job uses [tox] and [pytest] to run the unit tests written for
your repository and catch code regressions.
Any unit test you have added to your project will be executed on each git push,
allowing you to check the validity of your code before merging it.
Unit tests allow you to prevent regressions in your code when making changes,
and find bugs before they make their way into production.
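For illustration only, a unit test in the pytest style might look like the sketch below. `count_words` is a hypothetical helper standing in for your own worker code; it does not come from the template.

```python
def count_words(text: str) -> int:
    """Hypothetical helper a worker might apply to a transcription."""
    return len(text.split())


def test_count_words():
    # pytest collects functions whose names start with "test" and
    # reports a failure when an assert does not hold.
    assert count_words("hello world") == 2
    assert count_words("") == 0
```

If a later change broke `count_words`, this test would fail in the `test` job before the change could be merged.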
<!-- TODO:
For more information, see [Writing unit tests for your worker](../tests).
-->
## Building
When the `test` & `lint` jobs run successfully, the `docker` job runs. It will
try to build a docker image from your `Dockerfile`. This will check that your
`Dockerfile` is valid and builds an image successfully.
This build step is only used as a check, as Arkindex builds Docker images on
its own.
## Generating release notes
When the `docker` job is successful and the CI pipeline is running for a Git
tag, the `release-notes` job runs. It will list all the commits since the
previous tag and aggregate them to publish release notes on the GitLab project.
We provide an [open source Docker image](https://gitlab.com/teklia/devops/) to build these release notes,
but you'll need to provide your own GitLab access token so that the job can
publish release notes on your own repository.
You can generate an access token with the `api` scope on GitLab's [User Settings > Access Tokens](https://gitlab.com/-/profile/personal_access_tokens) page.
The token must then be set as a CI variable on your GitLab project:
1. Go to your project settings.
1. Go to the **CI / CD** section.
1. Click on `Expand` in the **Variables** section.
1. Add a new variable named `DEVOPS_GITLAB_TOKEN` whose value is your token.
[black]: https://github.com/psf/black
[gitflow]: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow
[pre-commit]: https://pre-commit.com/
[pytest]: https://docs.pytest.org/
[tox]: https://tox.readthedocs.io/
<svg id="mermaid-1611246541133" width="100%" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" height="252" style="max-width: 152.25001525878906px;" viewBox="0 0 152.25001525878906 252"><style>#mermaid-1611246541133{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#000000;}#mermaid-1611246541133 .error-icon{fill:#552222;}#mermaid-1611246541133 .error-text{fill:#552222;stroke:#552222;}#mermaid-1611246541133 .edge-thickness-normal{stroke-width:2px;}#mermaid-1611246541133 .edge-thickness-thick{stroke-width:3.5px;}#mermaid-1611246541133 .edge-pattern-solid{stroke-dasharray:0;}#mermaid-1611246541133 .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-1611246541133 .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-1611246541133 .marker{fill:#666;}#mermaid-1611246541133 .marker.cross{stroke:#666;}#mermaid-1611246541133 svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-1611246541133 .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#000000;}#mermaid-1611246541133 .label text{fill:#000000;}#mermaid-1611246541133 .node rect,#mermaid-1611246541133 .node circle,#mermaid-1611246541133 .node ellipse,#mermaid-1611246541133 .node polygon,#mermaid-1611246541133 .node path{fill:#eee;stroke:#999;stroke-width:1px;}#mermaid-1611246541133 .node .label{text-align:center;}#mermaid-1611246541133 .node.clickable{cursor:pointer;}#mermaid-1611246541133 .arrowheadPath{fill:#333333;}#mermaid-1611246541133 .edgePath .path{stroke:#666;stroke-width:1.5px;}#mermaid-1611246541133 .flowchart-link{stroke:#666;fill:none;}#mermaid-1611246541133 .edgeLabel{background-color:white;text-align:center;}#mermaid-1611246541133 .edgeLabel rect{opacity:0.5;background-color:white;fill:white;}#mermaid-1611246541133 .cluster rect{fill:hsl(210,66.6666666667%,95%);stroke:#26a;stroke-width:1px;}#mermaid-1611246541133 .cluster text{fill:#333;}#mermaid-1611246541133 
div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(-160,0%,93.3333333333%);border:1px solid #26a;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-1611246541133:root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;}#mermaid-1611246541133 flowchart{fill:apa;}</style><g><g class="output"><g class="clusters"></g><g class="edgePaths"><g class="edgePath LS-test LE-docker" style="opacity: 1;" id="L-test-docker"><path class="path" d="M30.900001525878906,47L30.900001525878906,72L57.058711534135796,97" marker-end="url(#arrowhead283)" style="fill:none"></path><defs><marker id="arrowhead283" viewBox="0 0 10 10" refX="9" refY="5" markerUnits="strokeWidth" markerWidth="8" markerHeight="6" orient="auto"><path d="M 0 0 L 10 5 L 0 10 z" class="arrowheadPath" style="stroke-width: 1px; stroke-dasharray: 1px, 0px;"></path></marker></defs></g><g class="edgePath LS-lint LE-docker" style="opacity: 1;" id="L-lint-docker"><path class="path" d="M124.02500915527344,47L124.02500915527344,72L97.86629914701655,97" marker-end="url(#arrowhead284)" style="fill:none"></path><defs><marker id="arrowhead284" viewBox="0 0 10 10" refX="9" refY="5" markerUnits="strokeWidth" markerWidth="8" markerHeight="6" orient="auto"><path d="M 0 0 L 10 5 L 0 10 z" class="arrowheadPath" style="stroke-width: 1px; stroke-dasharray: 1px, 0px;"></path></marker></defs></g><g class="edgePath LS-docker LE-release-notes" style="opacity: 1;" id="L-docker-release-notes"><path class="path" d="M77.46250534057617,136L77.46250534057617,170.5L77.46250534057617,205" marker-end="url(#arrowhead285)" style="fill:none"></path><defs><marker id="arrowhead285" viewBox="0 0 10 10" refX="9" refY="5" markerUnits="strokeWidth" markerWidth="8" markerHeight="6" orient="auto"><path d="M 0 0 L 10 5 L 0 10 z" class="arrowheadPath" style="stroke-width: 1px; stroke-dasharray: 1px, 
0px;"></path></marker></defs></g></g><g class="edgeLabels"><g class="edgeLabel" style="opacity: 1;" transform=""><g transform="translate(0,0)" class="label"><rect rx="0" ry="0" width="0" height="0"></rect><foreignObject width="0" height="0"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;"><span id="L-L-test-docker" class="edgeLabel L-LS-test' L-LE-docker"></span></div></foreignObject></g></g><g class="edgeLabel" style="opacity: 1;" transform=""><g transform="translate(0,0)" class="label"><rect rx="0" ry="0" width="0" height="0"></rect><foreignObject width="0" height="0"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;"><span id="L-L-lint-docker" class="edgeLabel L-LS-lint' L-LE-docker"></span></div></foreignObject></g></g><g class="edgeLabel" style="opacity: 1;" transform="translate(77.46250534057617,170.5)"><g transform="translate(-22.25,-9.5)" class="label"><rect rx="0" ry="0" width="44.5" height="19"></rect><foreignObject width="44.5" height="19"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;"><span id="L-L-docker-release-notes" class="edgeLabel L-LS-docker' L-LE-release-notes">on tag</span></div></foreignObject></g></g></g><g class="nodes"><g class="node default" style="opacity: 1;" id="flowchart-test-254" transform="translate(30.900001525878906,27.5)"><rect rx="0" ry="0" x="-22.900001525878906" y="-19.5" width="45.80000305175781" height="39" class="label-container"></rect><g class="label" transform="translate(0,0)"><g transform="translate(-12.900001525878906,-9.5)"><foreignObject width="25.800003051757812" height="19"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;">test</div></foreignObject></g></g></g><g class="node default" style="opacity: 1;" id="flowchart-docker-255" transform="translate(77.46250534057617,116.5)"><rect rx="0" ry="0" x="-34.01667022705078" y="-19.5" 
width="68.03334045410156" height="39" class="label-container"></rect><g class="label" transform="translate(0,0)"><g transform="translate(-24.01667022705078,-9.5)"><foreignObject width="48.03334045410156" height="19"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;">docker</div></foreignObject></g></g></g><g class="node default" style="opacity: 1;" id="flowchart-lint-256" transform="translate(124.02500915527344,27.5)"><rect rx="0" ry="0" x="-20.225006103515625" y="-19.5" width="40.45001220703125" height="39" class="label-container"></rect><g class="label" transform="translate(0,0)"><g transform="translate(-10.225006103515625,-9.5)"><foreignObject width="20.45001220703125" height="19"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;">lint</div></foreignObject></g></g></g><g class="node default" style="opacity: 1;" id="flowchart-release-notes-259" transform="translate(77.46250534057617,224.5)"><rect rx="0" ry="0" x="-58.48333740234375" y="-19.5" width="116.9666748046875" height="39" class="label-container"></rect><g class="label" transform="translate(0,0)"><g transform="translate(-48.48333740234375,-9.5)"><foreignObject width="96.9666748046875" height="19"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; white-space: nowrap;">release-notes</div></foreignObject></g></g></g></g></g></g></svg>
\ No newline at end of file
# Setting up a new worker
This page will guide you through creating a new Arkindex worker locally and
preparing a development environment.
This guide assumes you are using Ubuntu 20.04 or later and have root access.
## Preparing your environment
This section will guide you through preparing your system to create a new
Arkindex worker from our [official template][base-worker].
### Installing system dependencies
To retrieve the Arkindex worker template, you will need to have both Git and
SSH. Git is a version control system that you will later use to manage multiple
versions of your worker. SSH allows secure connections to remote machines, and
will be used in our case to retrieve the template from a Git server.
#### To install system dependencies
1. Run the following command:
```
sudo apt install git ssh
```
### Checking your version of Python
Our Arkindex worker template requires Python 3.6 or later. Checking if a
compatible version of Python is installed avoids further issues in the setup
process.
#### To check your version of Python
1. Run the following command: `python3 --version`
This command will have an output similar to the following:
```
Python 3.6.9
```
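The same check can be done from Python itself. The sketch below assumes only the minimum version stated above (3.6); `MINIMUM_VERSION` and `is_supported` are illustrative names.

```python
import sys

MINIMUM_VERSION = (3, 6)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the template's minimum."""
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= MINIMUM_VERSION


print("OK" if is_supported() else "Python 3.6 or later is required")
```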
### Installing Python
If you were unable to check your Python version as stated above because
`python3` was not found, you will need to install Python 3 on your system.
#### To install Python on Ubuntu
1. Run the following command:
```
sudo apt install python3 python3-pip python3-virtualenv
```
1. Check your Python version again, as instructed in the previous section.
### Installing Python dependencies
To bootstrap a new Arkindex worker, some Python dependencies will be required:
- [pre-commit] will be used to automatically check the
syntax of your source code.
- [tox] will be used to run unit tests.
<!--
TODO: Link to [unit tests](tests)
-->
- [cookiecutter] will be used to bootstrap the project.
- [virtualenvwrapper] will be used to manage Python virtual
environments.
#### To install Python dependencies
1. Run the following command:
```
pip3 install pre-commit tox cookiecutter virtualenvwrapper
```
1. Follow the
[official virtualenvwrapper setup instructions][virtualenvwrapper-setup]
until you are able to run `workon`.
`workon` should have an empty output, as no Python virtual environments have
been set up yet.
## Creating the project
This section will guide you through creating a new worker from our official
template and making it available on a GitLab instance.
### Creating a GitLab project
For a worker to be accessible from an Arkindex instance, it needs to be pushed
to the repository of a GitLab project. A GitLab project will also allow you to
manage different versions of a worker and run
[automated checks](ci/index) on your code.
#### To create a GitLab project
1. Open the **New project** form [on GitLab.com](https://gitlab.com/projects/new)
or on another GitLab instance
1. Enter your worker name as the **Project name**
1. Define a **Project slug** related to your worker, e.g.:
- `tesseract` for a Tesseract worker
- `opencv-foo` for an OpenCV worker related to project Foo
1. Click on the **Create project** button
### Bootstrapping the project
This section guides you through using our [official template][base-worker]
to get a basic structure for your worker.
#### To bootstrap the project
1. Open a terminal and go to a folder in which you will want your worker to be.
1. Enter this command and fill in the required information:
```
cookiecutter git@gitlab.com:teklia/workers/base-worker.git
```
Cookiecutter will ask you for several options:
`slug`
: A slug for the worker. This should use lowercase alphanumeric characters or
underscores to meet the code formatting requirements that the template
automatically enforces via [black].
`name`
: A name for the worker, purely used for display purposes.
`description`
: A general description of the worker. This will be used to initialize the `README.md` of your repository as well as the `help` command output.
`worker_type`
: An arbitrary string purely used for display purposes.
For example:
- `recognizer`,
- `classifier`,
- `dla`,
- `entity-recognizer`, etc.
`author`
: A name for the worker's author. Usually your first and last name.
`email`
: Your e-mail address. This will be used to contact you if any administrative need arises.
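To check a candidate `slug` before running Cookiecutter, a small validation helper can be sketched as follows. It assumes that "lowercase alphanumeric characters or underscores" is meant literally; `SLUG_PATTERN` and `is_valid_slug` are hypothetical names, not part of the template.

```python
import re

# Assumption: a valid slug is one or more of a-z, 0-9 and "_".
SLUG_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_slug(slug: str) -> bool:
    """Return True when the whole slug matches the assumed pattern."""
    return SLUG_PATTERN.fullmatch(slug) is not None
```

With this rule, `my_worker` is accepted, while `My-Worker` is rejected because of the uppercase letters and the hyphen.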
### Pushing to GitLab
This section guides you through pushing the newly created worker from your
system to the GitLab project's repository.
This section assumes you have Maintainer or Owner access to the GitLab project.
#### To push to GitLab
1. Enter the newly created directory, whose name starts with `worker-` and ends
with your worker's slug.
1. Add your GitLab project as a Git remote:
```
git remote add origin git@my-gitlab-instance.com:path/to/worker.git
```
You will need to use your own instance's URL and the path to your own
project. For example, a project named `hello` in the `teklia` group
on `gitlab.com` will use the following command:
```
git remote add origin git@gitlab.com:teklia/hello.git
```
1. Push the new branch to GitLab:
```
git push --set-upstream origin master
```
If you want to push a different branch, you first need to create it. For example,
if you want to push to a new branch named `bootstrap`, you will use:
```
git checkout -b bootstrap
git push --set-upstream origin bootstrap
```
1. Open your GitLab project in a browser.
1. Click on the blue icon indicating that [CI](ci/index)
is running on your repository, and wait for it to turn green to confirm
everything worked.
## Setting up your development environment
This section guides you through setting up a Python development environment
specifically for your worker.
### Activating the pre-commit hook
The official template includes code syntax checks such as trailing whitespace,
as well as code linting using [black]. Those checks run on GitLab as soon
as you push new code, but it is possible to run those automatically when you
create new commits using the [pre-commit] hook.
#### To activate the pre-commit hook
1. Run `pre-commit install`.
### Setting up the Python virtual environment
To install Python dependencies that are specific to your worker, and prevent
other dependencies installed on your system from interfering, it is recommended
to use a virtual environment.
#### To set up a Python virtual environment
1. Run `mkvirtualenv my_worker`, where `my_worker` is any name of your choice.
1. Install your worker in editable mode: `pip install -e .`
[base-worker]: https://gitlab.com/teklia/workers/base-worker/
[black]: https://github.com/psf/black
[cookiecutter]: https://cookiecutter.readthedocs.io/
[pre-commit]: https://pre-commit.com/
[tox]: https://tox.readthedocs.io/
[virtualenvwrapper]: https://virtualenvwrapper.readthedocs.io
[virtualenvwrapper-setup]: https://virtualenvwrapper.readthedocs.io/en/latest/install.html
# Workers
Arkindex has a powerful system to run asynchronous tasks. Those are based on
Docker images, and can do almost anything: ML processing, but also importing
data into Arkindex, or exporting from Arkindex to another system or file format.
This section consists of the following guides:
## Contents
* [Setting up a new worker](create)
* [Running your worker locally](run-local)
* [Maintaining a worker](maintenance)
* [GitLab CI for workers](ci/index)
* [YAML configuration](yaml)
* [Template structure](template-structure)
# Maintaining a worker
This page guides you through common tasks involved in maintaining an Arkindex
worker.
## Updating the template
To get the changes we make on our [official template][base-worker] to apply to
your worker, you will need to re-apply the template to the worker and resolve
any conflicts that may arise.
### To update the template
1. Run the following command:
```
cookiecutter base-worker -f --config-file YOURFILE.yaml --no-input
```
Where `YOURFILE.yaml` is the path of the YAML file you previously created.
1. Answer `yes` when Cookiecutter requests confirmation to delete and
re-download the template.
1. Using the Git diff, resolve the conflicts yourself as Cookiecutter will be
overwriting existing files.
[base-worker]: https://gitlab.com/teklia/workers/base-worker/
# Running your worker locally
Once you have implemented a worker, you can run it on some Arkindex elements
on your own machine to test it.
!!! warning
This section has been deprecated as of the latest version of base-worker.
## Retrieving credentials
For a worker to run properly, you will need two types of credentials:
- An API token that gives the worker access to the API
- A worker version ID that lets the worker send results to Arkindex and report
that those come from this particular worker version
### Retrieving a token
For the worker to run, you will need an Arkindex authentication token.
You can use your own account's token when testing on your own machine.
You can retrieve your personal API Token from your [profile page](https://doc.arkindex.org/users/auth/index.md#personal-token).
### Retrieving a worker version ID
A worker version ID will be required in order to publish results. If your worker
does not create any Arkindex element, classification, transcription, etc., you
may skip this step.
If this particular worker was already configured on this instance, you can use
its existing worker version ID; otherwise, you will need to ask an Arkindex
administrator to create a fake version ID.
#### To retrieve a worker version ID from an existing worker
1. Open a web browser and browse to the Arkindex instance.
2. In the top-right user menu, click on **My repositories**.
3. Click on your worker, listed in the **Workers** column.
4. Rewrite the URL in your browser's address bar so that it looks like
`https://<arkindex_url>/api/v1/workers/<worker_id>/versions/`
- Replace `process` with `api/v1`
- Add a slash character (`/`) at the end
In the JSON output from this API endpoint, the first value next to `"id"` is
the worker version ID.
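The manual steps above can also be sketched in Python. The URL layout is taken from the step above; passing the instance address with its scheme (for example `https://arkindex.teklia.com`) is an assumption, and both function names are hypothetical. Because the exact shape of the JSON output is not described here, `first_id` simply searches the raw text for the first `"id"` value instead of assuming a structure.

```python
import re


def worker_versions_url(arkindex_url: str, worker_id: str) -> str:
    """Build the versions endpoint URL, including the trailing slash."""
    return f"{arkindex_url.rstrip('/')}/api/v1/workers/{worker_id}/versions/"


def first_id(json_text: str):
    """Return the first string value found next to an "id" key, or None."""
    match = re.search(r'"id"\s*:\s*"([^"]+)"', json_text)
    return match.group(1) if match else None
```

Fetching the URL still requires an authenticated request, which is left out of this sketch.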
#### To create a fake worker as an administrator
This action can only be done as an Arkindex administrator with shell access.
1. In the backend's Docker image, run:
```
arkindex fake_worker_version --name <NAME> --slug <SLUG> --url <URL>
```
Replace `<NAME>`, `<SLUG>` and `<URL>` with the name, slug and GitLab
repository URL, respectively.
A Git repository is created with a fake OAuth access token. A fake Git revision
is added to this repository, and a fake worker version from a fake worker is
linked to this revision. You should get the following output:
```
Created a worker version: 392bd299-bc8f-4ec6-aa3c-e6503ecc7730
```
!!! warning
This feature should only be used when a normal worker cannot be created using the Git workflow.
## Setting credentials
You need to set three environment variables in a shell to pass your credentials
and Arkindex instance information to the worker:
`ARKINDEX_API_URL`
: URL that points to the root of the Arkindex instance you are using.
`ARKINDEX_API_TOKEN`
: The API token you retrieved earlier, on your profile page.
`WORKER_VERSION_ID`
: The worker version ID you retrieved earlier. Can be omitted if the worker does
not create new data in Arkindex.
### To set credentials for your worker
1. In a shell, run:
```sh
export ARKINDEX_API_URL="https://arkindex.teklia.com"
export ARKINDEX_API_TOKEN="YOUR_TOKEN_HERE"
export WORKER_VERSION_ID="xxxxx"
```
!!! warning
Do not add these instructions to a script such as `.bashrc`;
this would mean storing credentials in plaintext and can lead to security breaches.
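As a rough illustration of how a Python program can pick up these three variables, consider the sketch below. `read_worker_settings` is a hypothetical helper written for this page; it does not show how base-worker itself reads the environment.

```python
import os


def read_worker_settings(environ=os.environ) -> dict:
    """Collect the worker settings from environment variables."""
    return {
        # Required: a KeyError here means the variable was not exported.
        "api_url": environ["ARKINDEX_API_URL"],
        "api_token": environ["ARKINDEX_API_TOKEN"],
        # Optional: only needed when the worker publishes results.
        "worker_version_id": environ.get("WORKER_VERSION_ID"),
    }
```

Accepting the environment as a parameter makes the helper easy to test with a plain dictionary, without exporting real credentials.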
## Running your worker
With the credentials configured, you can now run your worker.
You will need a list of element IDs to run your worker on, which can be found
in the browser's address bar when browsing an element on Arkindex.
### To run your worker
1. Activate the Python environment: run `workon X` where `X` is the name of
your Python environment.
2. Run `worker-X`, where `X` is the slug of your worker, followed by
`--element=Y` where `Y` is the ID of an element. You can repeat `--element`
as many times as you need to process multiple elements.
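When running a worker on many elements, the command line described above can be assembled programmatically. The following is a minimal sketch; `build_worker_command` is a hypothetical helper, and only the `worker-X` name and the repeated `--element=Y` option come from the steps above.

```python
def build_worker_command(slug: str, element_ids: list) -> list:
    """Return the argument list for running a worker on some elements."""
    command = [f"worker-{slug}"]
    # --element is repeated once per element ID.
    command += [f"--element={element_id}" for element_id in element_ids]
    return command
```

The resulting list can be passed to `subprocess.run` from within the activated virtual environment.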
# Template structure
When building a new worker from our [official template][base-worker], a file
structure gets created for you to ease the burden of setting up a Python
package and a Docker build that follow the best development practices:
`.arkindex.yml`
: YAML configuration file that allows Arkindex to understand what it should do
with this repository.
To learn more about this file, see [YAML configuration](yaml.md).
`.cookiecutter.yaml`
: YAML file that stores the options you defined when creating a new worker.
This file can be reused to [fetch template updates][template-updates].
`.dockerignore`
: Lists which files to exclude from the Docker build context.
For more information, see the [Docker documentation][dockerignore].
`.flake8`
: Specifies configuration options for the Flake8 linter.
For more information, see the [Flake8 documentation][flake8].
`.gitignore`
: Lists which files to exclude from Git versioning.
For more information, see the [Git docs][gitignore].
`.gitlab-ci.yml`
: Configures the GitLab CI jobs and pipelines.
To learn more about the configuration we provide, see
[GitLab Continuous Integration for workers](ci/index).
`.isort.cfg`
: Configures the automatic Python import sorting rules.
For more information, see the [isort docs][isort].
`.pre-commit-config.yaml`
: Configures the [pre-commit hook](create#activating-the-pre-commit-hook).
`Dockerfile`
: Specifies how the Docker image will be built.
You can change the instructions in this file to update the image to the needs
of your worker, for example to install system dependencies.
`requirements.txt`
: Lists the Python dependencies your worker relies on. Those are automatically
installed by the default Dockerfile.
`tox.ini`
: Configures the Python unit test runner.
For more information, see the [tox docs][tox].
`setup.py`
: Configures the worker's Python package.
`VERSION`
: Official version number of your worker. Defaults to `0.1.0`.
`ci/build.sh`
: Script that gets run by [CI](ci/index) pipelines
to build the Docker image.
`tests/test_worker.py`
: An example unit test file.
<!--
TODO: For more information, see [Writing tests for your worker](tests).
-->
`worker_[slug]/__init__.py`
: Declares the folder as a Python package.
`worker_[slug]/worker.py`
: The core part of the worker. This is where you can write code that processes
Arkindex elements.
<!-- TODO:
For more information, see
[Implementing a Machine Learning worker](implement.md).
-->
[base-worker]: https://gitlab.com/teklia/workers/base-worker/
[dockerignore]: https://docs.docker.com/engine/reference/builder/#dockerignore-file
[flake8]: https://flake8.pycqa.org/en/latest/user/configuration.html
[gitignore]: https://git-scm.com/docs/gitignore
[isort]: https://pycqa.github.io/isort/docs/configuration/config_files/
[template-updates]: maintenance#updating-the-template
[tox]: https://tox.readthedocs.io/en/latest/config.html
docs/contents/workers/user_configuration/bool_config.png

28 KiB

docs/contents/workers/user_configuration/configuration_form.png

124 KiB

docs/contents/workers/user_configuration/dict_config.png

34.8 KiB

docs/contents/workers/user_configuration/enum_config.png

28.9 KiB

docs/contents/workers/user_configuration/float_config.png

29.1 KiB

docs/contents/workers/user_configuration/integer_config.png

26.4 KiB