Add optional confidence

Merged: Yoann Schneider requested to merge `add-optional-confidence` into `master`
1 file changed, 21 additions, 19 deletions
Backend for Historical Manuscripts Indexing
===========================================
[![pipeline status](https://gitlab.com/arkindex/backend/badges/master/pipeline.svg)](https://gitlab.com/arkindex/backend/commits/master)
@@ -31,6 +32,7 @@ arkindex/manage.py createsuperuser
For development purposes, you can customize the Arkindex settings by adding a YAML file as `arkindex/config.yml`. This file is not tracked by Git; if it exists, any configuration directive set in this file will be used for exposed settings from `settings.py`. You can view the full list of settings [on the wiki](https://wiki.vpn/en/arkindex/deploy/configuration).
Another way to customize your Arkindex instance is to add a Python file at `arkindex/project/local_settings.py`. Here you are not limited to exposed settings: you can customize any setting, or even load Python dependencies at boot time. This is not recommended, as your customization may not be available to real-world Arkindex instances.
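The difference between the two mechanisms can be sketched as follows. This is a minimal illustration only: `DEFAULTS`, `EXPOSED`, and `apply_overrides` are hypothetical names, not part of the Arkindex codebase, and the README does not say whether the real loader rejects or ignores non-exposed keys (this sketch rejects them).

```python
# Hypothetical sketch: none of these names come from the Arkindex codebase.
DEFAULTS = {"debug": False, "allowed_hosts": ["localhost"], "secret_key": "dev-only"}

# Settings assumed to be overridable from the YAML file.
EXPOSED = {"debug", "allowed_hosts"}


def apply_overrides(defaults, overrides, exposed):
    """Return a copy of defaults where exposed keys are replaced by overrides."""
    unknown = set(overrides) - exposed
    if unknown:
        # Assumption: non-exposed keys are refused rather than silently ignored.
        raise KeyError(f"Not an exposed setting: {sorted(unknown)}")
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# Stand-in for the parsed content of arkindex/config.yml.
parsed_config = {"debug": True}
settings = apply_overrides(DEFAULTS, parsed_config, EXPOSED)
```

A `local_settings.py` file, by contrast, is plain Python evaluated at boot time, so it is not restricted to a fixed set of keys.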
### ImageMagick setup
@@ -90,23 +92,23 @@ In this early version groups do not define any right yet.
At the root of the repository is a Makefile that provides commands for common operations:
-- `make` or `make all`: Clean and build;
-- `make base`: Create and push the `arkindex-base` Docker image that is used to build the `arkindex-app` image;
-- `make clean`: Clean up the Python package build and cache files;
-- `make build`: Build the arkindex Python package and recreate the `arkindex-app:latest` image without pushing to the GitLab container registry;
-- `make test-fixtures`: Create the unit test fixtures on a temporary PostgreSQL database and save them to the `data.json` file used by most Django unit tests.
+* `make` or `make all`: Clean and build;
+* `make base`: Create and push the `arkindex-base` Docker image that is used to build the `arkindex-app` image;
+* `make clean`: Clean up the Python package build and cache files;
+* `make build`: Build the arkindex Python package and recreate the `arkindex-app:latest` image without pushing to the GitLab container registry;
+* `make test-fixtures`: Create the unit test fixtures on a temporary PostgreSQL database and save them to the `data.json` file used by most Django unit tests.
### Django commands
Aside from the usual Django commands, some custom commands are available via `manage.py`:
-- `build_fixtures`: Create a set of database elements designed for use by unit tests in a fixture (see `make test-fixtures`);
-- `from_csv`: Import manifests and index files from a CSV list;
-- `import_annotations`: Import index files from a folder into a specific volume;
-- `import_acts`: Import XML surface files and CSV act files;
-- `delete_corpus`: Delete a big corpus using a Ponos task;
-- `reindex`: Run asynchronous tasks on the Celery worker to reindex transcriptions in ElasticSearch;
-- `telegraf`: A special command with InfluxDB-compatible output for Grafana statistics.
+* `build_fixtures`: Create a set of database elements designed for use by unit tests in a fixture (see `make test-fixtures`);
+* `from_csv`: Import manifests and index files from a CSV list;
+* `import_annotations`: Import index files from a folder into a specific volume;
+* `import_acts`: Import XML surface files and CSV act files;
+* `delete_corpus`: Delete a big corpus using a Ponos task;
+* `reindex`: Run asynchronous tasks on the Celery worker to reindex transcriptions in ElasticSearch;
+* `telegraf`: A special command with InfluxDB-compatible output for Grafana statistics.
See `manage.py <command> --help` to view more details about a specific command.
@@ -114,9 +116,9 @@ See `manage.py <command> --help` to view more details about a specific command.
Once your code appears to be working on a local server, a few checks have to be performed:
-- **Migrations:** Ensure that all migrations have been created by typing `./manage.py makemigrations`.
-- **Unit tests:** Run `./manage.py test` to perform unit tests.
-- Use `./manage.py test module_name` to perform tests on a single module, if you wish to spend less time waiting for all tests to complete.
+* **Migrations:** Ensure that all migrations have been created by typing `./manage.py makemigrations`.
+* **Unit tests:** Run `./manage.py test` to perform unit tests.
+- Use `./manage.py test module_name` to perform tests on a single module, if you wish to spend less time waiting for all tests to complete.
### Linting
@@ -143,9 +145,9 @@ IPython will give you a nicer shell with syntax highlighting, auto reloading and
[Django Debug Toolbar](https://django-debug-toolbar.readthedocs.io/en/latest/) provides you with a neat debug sidebar that will help diagnose slow API endpoints or weird template bugs. Since the Arkindex frontend is completely decoupled from the backend, you will need to browse to an API endpoint to see the debug toolbar.
-[Django Extensions](https://django-extensions.readthedocs.io/en/latest/) adds a _lot_ of `manage.py` commands; the most important one is `./manage.py shell_plus`, which runs the usual shell but with all the available models pre-imported. You can add your own imports with the `local_settings.py` file. Here is an example that imports most of the backend's enums and some special QuerySet features:
+[Django Extensions](https://django-extensions.readthedocs.io/en/latest/) adds a *lot* of `manage.py` commands; the most important one is `./manage.py shell_plus`, which runs the usual shell but with all the available models pre-imported. You can add your own imports with the `local_settings.py` file. Here is an example that imports most of the backend's enums and some special QuerySet features:
-```python
+``` python
SHELL_PLUS_POST_IMPORTS = [
('django.db.models', ('Value', )),
('django.db.models.functions', '*'),
@@ -174,7 +176,7 @@ You may want to also uninstall `django-nose`, as it is an optional test runner t
We use [rq](https://python-rq.org/), integrated via [django-rq](https://pypi.org/project/django-rq/), to run tasks without blocking an API request or causing timeouts. To call them in Python code, you should use the trigger methods in `arkindex.project.triggers`; those will do some safety checks to make catching some errors easier in dev. The actual tasks are in `arkindex.documents.tasks`. The following tasks exist:
-- Delete a corpus: `corpus_delete`
-- Reindex elements, transcriptions or entities into ElasticSearch: `reindex_start`
+* Delete a corpus: `corpus_delete`
+* Reindex elements, transcriptions or entities into ElasticSearch: `reindex_start`
To run them, use `make worker` to start an RQ worker. You will need to have Redis running; `make slim` or `make` in the architecture will provide it. `make` in the architecture also provides an RQ worker running in Docker from a binary build.
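The trigger-then-worker flow described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the backend's implementation: an in-memory `queue.Queue` stands in for the Redis-backed RQ queue, and `trigger_corpus_delete`, `run_worker`, and the body of `corpus_delete` are hypothetical.

```python
import queue

# In-memory stand-in for the Redis-backed RQ queue; purely illustrative.
task_queue = queue.Queue()


def corpus_delete(corpus_id):
    """Hypothetical task body; the real task lives in arkindex.documents.tasks."""
    return f"deleted corpus {corpus_id}"


def trigger_corpus_delete(corpus_id):
    """Hypothetical trigger: validate the argument first, then enqueue the task."""
    if not isinstance(corpus_id, str) or not corpus_id:
        raise ValueError("corpus_id must be a non-empty string")
    task_queue.put((corpus_delete, (corpus_id,)))


def run_worker():
    """Drain the queue, running each task in order, as a worker process would."""
    results = []
    while not task_queue.empty():
        func, args = task_queue.get()
        results.append(func(*args))
    return results


trigger_corpus_delete("my-corpus")
results = run_worker()
```

Validating in the trigger means a bad argument fails immediately in the calling code, rather than later inside the worker where the error is harder to trace.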