# Development

DAN relies on several tools during development: linters, automated tests and a documentation builder.

## Linter

Code is linted before it is submitted.

To run the full linter suite, use [pre-commit](https://pre-commit.com).

```shell
pip install pre-commit
pre-commit run -a
```
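
To have the linters run automatically on every commit, pre-commit can also be installed as a Git hook. The following is a minimal sketch, assuming it is run from the root of the clone. The guard around the command is added here for illustration and is not part of the project's tooling.

```shell
# The guard makes this a no-op when pre-commit is not installed
# or when the current directory is not inside a Git repository.
if command -v pre-commit >/dev/null 2>&1 && git rev-parse --git-dir >/dev/null 2>&1; then
    # Writes the pre-commit hook so the linters run on every `git commit`
    pre-commit install
fi
```

Once the hook is installed, `pre-commit run -a` remains useful to check files that were committed before the hook existed.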

## Tests

### Unit tests

Tests are executed with [tox](https://tox.wiki) using [pytest](https://pytest.org).

```shell
pip install tox
tox
```

To recreate the tox virtual environment (e.g. after a dependency update), run `tox -r`.

- Run a single test module: `tox -- <test_path>`
- Run a single test: `tox -- <test_path>::<test_function>`

The tests use a large file stored via [Git-LFS](https://docs.gitlab.com/ee/topics/git/lfs/). Make sure to run `git-lfs pull` before running them.
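
If the tests fail on the large file, one possible cause is that it is still a Git LFS pointer rather than the real content. The helper below is a sketch and not part of DAN. `is_lfs_pointer` is a hypothetical name, and the check relies only on the fact that LFS pointer files are small text files whose first line starts with `version https://git-lfs.github.com/spec/v1`.

```shell
# Succeed (exit status 0) if the given file is still a Git LFS pointer,
# i.e. its real content has not been downloaded yet.
is_lfs_pointer() {
    # Only the first bytes are read, so this stays cheap on large binary files.
    head -c 64 "$1" 2>/dev/null \
        | LC_ALL=C grep -q '^version https://git-lfs\.github\.com/spec/v1'
}

# Example (the path is a placeholder, not a real file name):
#   is_lfs_pointer path/to/file && git lfs pull
```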

### Commands

Since unit tests do not cover *everything*, it is sometimes necessary to run DAN commands directly to test your changes.

#### Training command

The library already has all the documents needed to run the [training command](../usage/train/index.md) on a minimalist dataset. You can use the configuration available at `configs/tests.json`. It is already populated with the parameters used in the unit tests.

```shell
teklia-dan train --config configs/tests.json
```

#### Predict command

The library already has all the documents needed to run the [predict command](../usage/predict/index.md) with a minimalist model. From the `tests/data/prediction` directory, run the following command, adding any extra parameters you need:

```shell
teklia-dan predict \
    --image-dir images/ \
    --image-extension png \
    --model . \
    --output /tmp/dan-predict
```

#### Evaluation command

The library already has all the documents needed to run the [evaluation command](../usage/evaluate/index.md) on a minimalist dataset. You can use the configuration available at `configs/eval.json`. It is already populated with the parameters used in the unit tests.

```shell
teklia-dan evaluate --config configs/eval.json
```

#### Convert command

If you want to evaluate NER models with your own scripts, you can convert DAN's predictions to [BIO](<https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)>) format using the [convert command](../usage/convert/index.md).

```shell
teklia-dan convert /tmp/dan-predict --tokens tokens.yml --output /tmp/dan-convert
```
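
As a starting point for such a script, the sketch below counts the entities of each type in a BIO file. It assumes one whitespace-separated `token TAG` pair per line with the tag in the last column. That is a common BIO layout, but it is not confirmed here for DAN's output. `count_bio_entities` is a hypothetical helper name.

```shell
# Print "<entity type> <count>" for each entity type found in a BIO file.
# Assumption: one "token TAG" pair per line, with the tag in the last column.
count_bio_entities() {
    # An entity starts on each B- tag; I- tags continue it and O tags are ignored.
    awk 'NF >= 2 && $NF ~ /^B-/ { counts[substr($NF, 3)]++ }
         END { for (label in counts) print label, counts[label] }' "$1" | sort
}

# Example (the path is a placeholder, not a real file name):
#   count_bio_entities path/to/file.bio
```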

## Documentation

This documentation uses [Sphinx](http://www.sphinx-doc.org/) and was generated using [MkDocs](https://mkdocs.org/) and [mkdocstrings](https://mkdocstrings.github.io/).

### Setup

Install the required dependencies:

```shell
# In a clone of the Git repository
pip install -r doc-requirements.txt
```

Build the documentation using `mkdocs serve -v`. You can then write in [Markdown](https://www.markdownguide.org/) in the relevant `docs/*.md` files, and see live output on http://localhost:8000.
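
Before submitting documentation changes, a one-off strict build can catch warnings that are easy to miss in the live server output. This sketch assumes that `mkdocs.yml` sits at the root of the repository. `/tmp/dan-docs` is an arbitrary output directory chosen for the example.

```shell
# The guard makes this a no-op when MkDocs is not installed
# or when no mkdocs.yml is found in the current directory.
if command -v mkdocs >/dev/null 2>&1 && [ -f mkdocs.yml ]; then
    # --strict turns warnings into errors, so the build fails instead of passing silently
    mkdocs build --strict --site-dir /tmp/dan-docs
fi
```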

### Linter

This documentation is linted with:

- `doc8`, whose linting rules are described in [its documentation][1],
- `mdformat`, whose formatting rules are described in [its documentation][2].

[1]: https://doc8.readthedocs.io/en/latest/readme.html#usage
[2]: https://mdformat.readthedocs.io/en/stable/users/style.html