Commit b85a3ea6 authored by ml bonhomme, committed by Bastien Abadie

Document training models

parent 8f56f47b
1 merge request: !87 Document training models
Pipeline #13881 passed
Showing
with 83 additions and 8 deletions
......@@ -2,7 +2,7 @@
title = "Annotate documents"
description = "Use Arkindex to annotate your documents"
weight = 60
weight = 90
+++
The general annotation process is as follows:
......
+++
title = "Export a project"
weight = 60
weight = 100
+++
This page will guide you through exporting a project as an SQLite database.
......
......@@ -2,7 +2,7 @@
title = "Deploy Arkindex on-premise"
description = "Deploy Arkindex on your own infrastucture"
weight = 100
weight = 110
+++
......
......@@ -2,7 +2,7 @@
title = "Run a Machine Learning process"
description = "Use Arkindex to annotate your documents"
weight = 30
weight = 40
+++
Arkindex aims to provide **Machine Learning tools** to process your documents.
......
......@@ -2,7 +2,7 @@
title = "Use workflow templates"
description = "Save and use workflow templates"
weight = 30
weight = 60
+++
Some complex **Machine Learning processes** can be created using multiple workers. When creating a process, you need to add each worker to your process and setup dependencies. This step presents two drawbacks:
......
content/howto/train_model/actions_train.png

63.3 KiB

content/howto/train_model/folder_picker.png

141 KiB

+++
title = "Train a Machine Learning model"
description = "Use Arkindex to train a Machine Learning model"
weight = 50
+++
You can use Arkindex to train machine learning models for Arkindex's workers, using annotated data from any Arkindex project you have access to. The project must contain **at least one folder**: the folder holding the training data.
You can also use optional validation and test folders.
To start a training process for a given Model on a given Project, you need **contributor** access to the Project and **admin** access to the Model.
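The access rule above can be expressed as a small check. This is only an illustrative sketch: the role names, their ordering, and the `can_start_training` function are assumptions made for the example, not part of Arkindex's code or API.

```python
# Hypothetical role names, ordered from least to most privileged.
ROLE_LEVELS = {"none": 0, "guest": 1, "contributor": 2, "admin": 3}


def can_start_training(project_role: str, model_role: str) -> bool:
    """Return True when both roles satisfy the rule stated above.

    Assumption: a higher role includes the rights of the lower ones.
    """
    has_project_access = ROLE_LEVELS[project_role] >= ROLE_LEVELS["contributor"]
    has_model_access = ROLE_LEVELS[model_role] >= ROLE_LEVELS["admin"]
    return has_project_access and has_model_access
```

For instance, a project contributor who is only a contributor on the model would be refused, while a project contributor who is a model admin would be allowed.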
The training interface can be accessed from the **Actions** dropdown menu on the right of the header of a project.
{{ figure(image="howto/train_model/actions_train.png", height=260, caption="'Train a model' in the Actions menu") }}
## The training interface
In order to train a Machine Learning model, you have to set a number of parameters in the training process configuration form.
{{ figure(image="howto/train_model/training_form.png", height=600, caption="The training process configuration form") }}
### Naming your training process
First, you have to name your training process. This will be useful to find it again in the processes list, if you navigate away from the process status page.
### Selecting a worker version
Then, you need to select the worker version that will perform the training. For example, if you want to train a model for Doc-UFCN, you need to select the latest available version of this worker in the worker version selection modal. Once training is finished, the trained model can be used in Machine Learning processes that run this worker.
{{ figure(image="howto/train_model/version_picker.png", height=500, caption="Worker version selection") }}
### Configuring the training process
You can (optionally) add a training configuration to your training process. You can either select an existing configuration or create a new one, using the configuration modal.
{{ figure(image="howto/train_model/training_config.png", height=350, caption="Training configuration") }}
### Selecting a model to train
You have to select the model you will be training, among the available models. You can also, optionally, select a model version to start your training from.
{{ figure(image="howto/train_model/model_selection.png", height=500, caption="Model selection modal") }}
### Training, validation and test folders
You have to select a training folder, containing the data you want to train your model on, from the existing folders of the project. You can also select a validation folder and a test folder.
{{ figure(image="howto/train_model/folder_picker.png", height=350, caption="Folder picker modal") }}
The data contained in the training and validation folders is used to train the model, while the data contained in the test folder is never used during the training process: it only serves to evaluate the model's performance on totally new data.
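The roles of the three folders can be illustrated with a toy example. The sketch below is not how any Arkindex worker trains a model: it uses a made-up one-dimensional threshold classifier, and all function names and the `shifts` parameter are assumptions chosen for illustration. It only shows that training data fits the model, validation data picks between candidate models, and test data is used once, after training.

```python
from statistics import mean
from typing import List, Tuple

Sample = Tuple[float, int]  # (feature value, label 0 or 1)


def fit_threshold(train: List[Sample], shift: float) -> float:
    """Fit on training data: midpoint between the two class means, plus a shift."""
    mean0 = mean(x for x, y in train if y == 0)
    mean1 = mean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2 + shift


def accuracy(threshold: float, samples: List[Sample]) -> float:
    """Share of samples whose label matches the prediction `x >= threshold`."""
    hits = sum(1 for x, y in samples if int(x >= threshold) == y)
    return hits / len(samples)


def train_and_evaluate(
    train: List[Sample],
    validation: List[Sample],
    test: List[Sample],
    shifts: List[float],
) -> Tuple[float, float]:
    # Training and validation: fit one candidate per shift on the training
    # data, then keep the candidate that scores best on the validation data.
    candidates = [fit_threshold(train, shift) for shift in shifts]
    best = max(candidates, key=lambda t: accuracy(t, validation))
    # Test: the test data is only used here, once training is over.
    return best, accuracy(best, test)
```

Because the test samples never influence which threshold is kept, the returned test accuracy is an estimate of how the model behaves on data it has never seen.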
### GPU usage
Lastly, you can choose whether or not to use a GPU to train your model, using the GPU toggle.
You can then click the **Start training** button.
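As a recap, the fields of the training form can be summarised as a small data structure. This is a hedged sketch only: the `TrainingRequest` class and its field names are hypothetical and do not correspond to an Arkindex API object. It simply records which settings this page describes as required and which as optional.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrainingRequest:
    """Hypothetical mirror of the training form; not an Arkindex API object."""

    # Required settings
    name: str
    worker_version: str
    model: str
    train_folder: str
    # Optional settings
    configuration: Optional[str] = None
    model_version: Optional[str] = None  # version to start the training from
    validation_folder: Optional[str] = None
    test_folder: Optional[str] = None
    use_gpu: bool = False

    def missing_fields(self) -> List[str]:
        """List the required settings that are still empty."""
        required = {
            "name": self.name,
            "worker_version": self.worker_version,
            "model": self.model,
            "train_folder": self.train_folder,
        }
        return [field for field, value in required.items() if not value]
```

A request with an empty `train_folder` would report that field as missing, which matches the rule that the training folder is the only mandatory folder.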
## Training Process Status
This takes you to a process status page, similar to the one used to follow a Workers workflow. You can leave this status page and find it again in the Processes List (`/process`). This list can be filtered with various parameters, including the process name, so you can easily find your training process again and monitor it.
Once the training has been successfully completed, your new model is available to use in [Machine Learning Processes](../run-process/).
content/howto/train_model/model_selection.png

127 KiB

content/howto/train_model/training_config.png

91.5 KiB

content/howto/train_model/training_form.png

123 KiB

content/howto/train_model/version_picker.png

221 KiB

+++
title = "Import a Transkribus collection"
description = "Import documents and annotation from Transkribus to Arkindex"
weight = 50
weight = 80
+++
As an Arkindex user, it is possible to import your [Transkribus collections](https://transkribus.eu/) on the Arkindex platform. There you will find the architecture of your collection (folder, page) and the transcriptions you have made.
......
......@@ -2,7 +2,7 @@
title = "Upload images on Teklia's storage"
description = "Uploading images directly to a Teklia IIIF server managed with Ceph"
weight = 35
weight = 70
+++
Teklia provides a new service for uploading documents and images hosted with [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/). It is much more similar to the standard IIIF upload but with some slight changes depending on the tool used. The account creation and the utilization of images after the upload is similar with the other Teklia's IIIF services.
......
......@@ -89,6 +89,7 @@ Please note that as for other resources, at least one user or group must have an
| Manage members | ❌ | ❌ | ❌ | ✅ |
| Delete elements | ❌ | ❌ | ❌ | ✅ |
| Start a Machine Learning process | ❌ | ❌ | ❌ | ✅ |
| Start a Machine Learning Training process | ❌ | ❌ | ✅ | ✅ |
### Repository access
......@@ -163,7 +164,7 @@ A user can see the Machine Learning models they have access to from **My models*
Guests can only see available versions with a set tag while contributors (or admins) can see all of them.
If you have admin rights on the model, you can delete its versions and manage its rights.
If you have admin rights on the model, you can delete its versions and manage its rights. You can also create a new model version by creating a [training process](../../howto/train-model/) for this Model.
{{ figure(image="users/rights/models-list.png", height=330, caption="Models management page") }}
#### Models permission table
......@@ -175,3 +176,13 @@ If you have admin rights on the model, you can delete its versions and manage it
| List its versions | ❌ | ❌ | ✅ | ✅ |
| Delete a version | ❌ | ❌ | ❌ | ✅ |
| Manage members | ❌ | ❌ | ❌ | ✅ |
#### Training processes permission table
| action | no right | guest | contributor | admin |
|-----------------------------------------------|----------|-------|-------------|--------|
| See a training process | ❌ | ✅ | ✅ | ✅ |
| Configure and start a training process | ❌ | ❌ | ❌ | ✅ |
| Stop a running training process | ❌ | ❌ | ❌ | ✅ |
| Retry a failed training process | ❌ | ❌ | ❌ | ✅ |