Tutorial to train a YOLO segmentation model

Commit adc724d4 authored 11 months ago by Yoann Schneider Committed by Bastien Abadie 11 months ago
--- a/content/howto/export/action.png
+++ b/content/howto/export/action.png
--- a/content/howto/export/modal.png
+++ b/content/howto/export/modal.png
--- a/content/howto/export/modal_empty.png
+++ b/content/howto/export/modal_empty.png
--- a/content/tutorial/segmentation-training.md
+++ b/content/tutorial/segmentation-training.md
@@ -4,6 +4,224 @@ weight = 60
 draft = true
 +++

- train yolo / Doc-ufcn on lines
- build lines
- evaluation
\ No newline at end of file
+In this tutorial, you will learn how to train a segmentation model in Arkindex.
+
+Follow this tutorial after creating the [ground-truth annotations](@/tutorial/segmentation-ground-truth.md).
+
+## Make your dataset immutable
+
+The `Pellet` dataset, created during [the data partitioning step](@/tutorial/corpus.md#data-partitioning) and annotated on [Callico](@/tutorial/segmentation-ground-truth.md), now has three sets named `train`, `val` and `test`.
+
+To avoid any issues ([data leakage][1], accidental deletions, ...), we should lock the state of our dataset. To make it immutable, we will:
+1. generate an SQLite export of the corpus,
+2. save a snapshot of the elements contained in the dataset.
+
+### Generating an export of the corpus
+
+Browse to the page of the `Europeana | Pellet` corpus. Instructions to start an SQLite export are detailed in [our guide to export a corpus](@/howto/export/index.md#to-start-an-export). Once started, read how to [monitor the status of your export](@/howto/export/index.md#monitoring-an-export).
+
+Wait for your export to be **Done** before going further in this tutorial. 
+
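Once the export is **Done** and downloaded, a quick sanity check can confirm that it actually contains data. The sketch below makes no assumption about the export's schema: it only lists whatever tables the SQLite file holds and counts their rows. The function name `summarize_export` and the file name in the usage note are hypothetical.

```python
import sqlite3
from pathlib import Path


def summarize_export(db_path: Path) -> dict[str, int]:
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Open the file read-only so the export cannot be modified by accident.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names are read from sqlite_master, so quoting them is sufficient.
        return {
            table: connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        connection.close()
```

For example, `summarize_export(Path("pellet-export.sqlite"))` would return one entry per table; an empty dictionary or all-zero counts suggests the export should be checked before locking the dataset.
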
+### Locking the dataset
+
+Once the export is ready, we can lock the dataset.
+
+{% warning() %}
+After this operation is done, you will not be able to edit the dataset or its sets anymore.
+{% end %}
+
+Browse to the page of the `Europeana | Pellet` corpus and click **Create dataset process** in the **Actions** dropdown menu.
+
+{{ figure(image="tutorial/training/segmentation/create.png", height=500, caption="Create a new dataset process") }}
+
+Select the `Pellet` dataset and select `All sets` to add all three sets to the process. 
+
+{{ figure(image="tutorial/training/segmentation/select.png", height=500, caption="Select the dataset and its sets") }}
+
+Press the button in the **Actions** column to confirm.
+
+{{ figure(image="tutorial/training/segmentation/confirm.png", height=500, caption="The dataset and its sets are selected") }}
+
+Use the **Configure workers** button to move on to worker selection.
+Press the **Select workers** button, search for `Generic Training Dataset Extractor` and press the **Enter** keyboard key.
+
+{{ figure(image="tutorial/training/segmentation/worker_search.png", height=300, caption="Search for the Generic Training Dataset Extractor worker") }}
+
+Click on the name of the worker on the right and select the first listed version with the `main` reference by clicking on the button in the **Actions** column.
+
+{{ figure(image="tutorial/training/segmentation/worker_add.png", height=500, caption="Add the Generic Training Dataset Extractor worker to the process") }}
+
+Close the modal by clicking on the **Done** button on the bottom left.
+
+Your process should be configured and ready to go.
+
+{{ figure(image="tutorial/training/segmentation/extract_process.png", height=500, caption="Configured process, ready to launch") }}
+
+Click on the **Run process** button to launch the process.
+
+{{ figure(image="tutorial/training/segmentation/process_running.png", height=500, caption="Dataset processing in progress") }}
+
+Wait until the process is over.
+
+If you browse to the page of your dataset, you should see **Complete** next to the dataset's state.
+
+{{ figure(image="tutorial/training/segmentation/complete_dataset.png", height=100, caption="Dataset is now complete, ready for training") }}
+
+{% info() %}
+This operation is only done once per dataset. If you need to train another model, you will not have to regenerate the dataset, as long as the annotations were already present on the elements at this point.
+
+However, if you added information on the elements, you will need to:
+1. Clone the dataset (to keep the data partitioning)
+2. Make that new dataset immutable, following [this section's instructions](#make-your-dataset-immutable).
+{% end %}
+
+## Create a model
+
+The training will save the model's files as a new version on Arkindex. In this section, we will create the model that will hold this new version.
+
+Click on **Models** in the top right dropdown (with your email address).
+
+{{ figure(image="tutorial/training/segmentation/model_navbar.png", height=100, caption="Browse to the Models page") }}
+
+Click on the **Create a model** button to create a new model. This will open a new page where you can fill in your model's information. It is a good idea to name the model after:
+- the machine learning technology used,
+- the dataset,
+- the type of element present in the dataset.
+
+In our case, we are training:
+- a [YOLO v8](https://docs.ultralytics.com) model,
+- on the **Pellet** dataset,
+- on **page** elements.
+
+A suitable name would be `YOLO | Pellet (page)`.
+In the description, you can add a link towards the dataset on [Europeana](https://europeana.transcribathon.eu/documents/story/?story=121795). The description supports [Markdown](https://spec.commonmark.org/) input.
+
+{{ figure(image="tutorial/training/segmentation/create_model.png", height=500, caption="Create a new model") }}
+
+
+{% info() %}
+A model can hold multiple versions. If you do another training under different conditions (for a longer period of time, ...), you do not have to create a brand new model again.
+{% end %}
+
+## Start your training process
+
+Now that you have a dataset and a model, we can create the training process.
+
+Create a process using all sets of the dataset. The procedure is the same as before, when we [locked the dataset](#locking-the-dataset).
+
+Since the state of the dataset has changed, you should now see the following process selection.
+
+{{ figure(image="tutorial/training/segmentation/train_process_select.png", height=500, caption="The dataset and its sets are selected") }}
+
+Proceed to worker configuration. Press the **Select workers** button, search for `YOLO Training | Detect/Segment` and press the **Enter** keyboard key.
+
+Click on the name of the worker on the right and select the first version listed by clicking on the button in the **Actions** column.
+
+{{ figure(image="tutorial/training/segmentation/train_worker_add.png", height=500, caption="Add the YOLO Training | Detect/Segment worker to the process") }}
+
+Close the modal by clicking on the **Done** button on the bottom left.
+
+Configure the `YOLO Training | Detect/Segment` worker by clicking on the button in the **Configuration** column. This will open a new modal, where you can pass specific parameters used for training. The full description of the fields is available on [the worker's description page](https://demo.arkindex.org/process/workers/ba10ef9d-4f96-4245-910c-23387ce6921a).
+
+Select **New configuration** in the left column to create a new configuration. Again, name it after the dataset you are using.
+
+{{ figure(image="tutorial/training/segmentation/train_configuration.png", height=500, caption="Worker configuration") }}
+
+The most important parameters are:
+- *Model that will receive the new trained version*: search for the name of [your model](#create-a-model),
+- *Number of epochs[^epoch] to train[^training] the model*: the default value is good enough, but you can set it to a larger number if you want to train for a longer period of time,
+- *Type of object to detect using the segmenter*: 
+  - a segmenter will produce masks (polygons),
+  - a detector will produce bounding boxes (rectangles).
+
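To make the difference between the two output types concrete, here is a minimal sketch in plain Python, independent of the worker's actual code. It derives the bounding box that encloses a polygon and compares their areas. The function names and the sample coordinates are illustrative assumptions.

```python
Point = tuple[int, int]


def bounding_box(polygon: list[Point]) -> tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) of the rectangle enclosing the polygon."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: list[Point]) -> float:
    """Area of a simple polygon, computed with the shoelace formula."""
    total = 0
    for index, (x1, y1) in enumerate(polygon):
        # Wrap around so the last point is paired with the first one.
        x2, y2 = polygon[(index + 1) % len(polygon)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# A slanted text line, as a segmenter might outline it (made-up coordinates).
slanted_line = [(10, 40), (200, 10), (205, 40), (15, 70)]
x_min, y_min, x_max, y_max = bounding_box(slanted_line)
box_area = (x_max - x_min) * (y_max - y_min)
```

With these sample points the polygon covers half of the area of its bounding box, which illustrates why masks tend to fit slanted or curved objects more tightly than rectangles do.
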
+Click on **Create** then **Save** when you are done filling the fields. Your process is ready to go.
+
+{{ figure(image="tutorial/training/segmentation/train_configured_process.png", height=500, caption="Configured process, ready to launch") }}
+
+Click on the **Run process** button to launch the process.
+
+{{ figure(image="tutorial/training/segmentation/train_process_running.png", height=500, caption="Training process is running") }}
+
+While it is running, the logs of the tasks are displayed. Multiple things happen during this process:
+1. The dataset is converted into the right format for [YOLO v8 segmentation models](https://docs.ultralytics.com/datasets/segment/),
+2. Training starts, for as long as needed,
+3. Evaluation on all three splits,
+4. Publication of the model on Arkindex.
+
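As an illustration of the first step, the YOLO segmentation format documented by Ultralytics stores one object per line: a class index followed by the polygon's x and y coordinates, normalized by the image width and height. The sketch below builds such a line. It is not the worker's conversion code; the function name and the six-decimal precision are assumptions made for this example.

```python
def to_yolo_segment_line(
    class_index: int,
    polygon: list[tuple[float, float]],
    image_width: int,
    image_height: int,
) -> str:
    """Format a pixel-space polygon as one line of a YOLO segmentation label file."""
    values = [str(class_index)]
    for x, y in polygon:
        # Coordinates are expressed as fractions of the image size, between 0 and 1.
        values.append(f"{x / image_width:.6f}")
        values.append(f"{y / image_height:.6f}")
    return " ".join(values)


line = to_yolo_segment_line(0, [(0, 0), (50, 0), (50, 100)], 100, 200)
```

Here `line` holds `0 0.000000 0.000000 0.500000 0.000000 0.500000 0.500000`: class `0`, then three normalized points.
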
+Training artifacts will be available when the process is finished. Click on the **Artifacts** button to list artifacts and click on `results.tar.gz` to download it.
+
+{{ figure(image="tutorial/training/segmentation/train_download_artifacts.png", height=500, caption="Download training artifacts") }}
+
+This archive has the following file structure:
+- one `train` folder: with graphs and sheets describing how the training went,
+- three `eval_*` folders: the evaluation results on all splits.
+
+Make sure to take a look at these if you want to know more about your model's performance.
+
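If you prefer to check the archive's layout before extracting it, the following sketch lists its top-level entries using Python's standard `tarfile` module. The function name is hypothetical, and the exact layout inside `results.tar.gz` may differ from what is described above (for instance, if everything sits under a single root folder).

```python
import tarfile
from pathlib import Path


def top_level_entries(archive_path: Path) -> list[str]:
    """Return the sorted names of the top-level files and folders in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
    entries = set()
    for name in names:
        parts = Path(name).parts
        # Skip the bare "." entry that some archives contain.
        if parts:
            entries.add(parts[0])
    return sorted(entries)
```

Calling `top_level_entries(Path("results.tar.gz"))` on the downloaded artifact should show the `train` folder and the `eval_*` folders if the archive matches the structure described above.
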
+Visit the page of your model to see your brand new trained model version. To do so, browse to the **Models** page and search for your model.
+
+{{ figure(image="tutorial/training/segmentation/train_check_version.png", height=500, caption="The new model version is displayed under the model") }}
+
+You can download it to use it on your own or you can use it to process pages already on Arkindex, as described in the next section.
+
+## Evaluation
+
+Graphs are nice to get an idea of how the model performs on unknown data. However, it is easier to form an opinion when the predictions are actually displayed.
+
+In this section, you will learn to process the test set of your dataset with your newly trained model.
+
+### Creating the process
+
+Browse to the folder containing the elements of the dataset, in the corpus you created in the [earlier steps of the tutorial](@/tutorial/corpus.md#data-partitioning).
+
+{{ figure(image="tutorial/training/segmentation/inference_test_set.png", height=500, caption="Folders containing the elements of the dataset.") }}
+
+Click on the `test` folder. Elements in the `test` set will be displayed. Then click on **Create process** in the **Actions** menu.
+
+{{ figure(image="tutorial/training/segmentation/inference_create_process.png", height=500, caption="Create a process on the `test` set.") }}
+
+Filter elements by type `page` and trigger the **Load children** toggle to display `page` elements.
+
+{{ figure(image="tutorial/training/segmentation/inference_filter_elements.png", height=500, caption="Filter the `page` elements of the `test` set.") }}
+
+Click on **Configure workers** to move on to worker selection. Press the **Select workers** button, search for `YOLO V8 Segmenter` and press the **Enter** keyboard key. Just like we did in the previous sections, click on the name of the worker on the right and select the first version listed by clicking on the button in the **Actions** column.
+
+{{ figure(image="tutorial/training/segmentation/inference_worker_add.png", height=500, caption="Add the YOLO V8 Segmenter worker to the process") }}
+
+Close the modal by clicking on the **Done** button on the bottom left.
+
+Now it is time to select the model you trained. Click on the button in the **Model version** column. In the modal that opens:
+1. Trigger the **Show all models** toggle,
+2. Look for the name of your trained model,
+3. Add the model version by clicking on **Use** in the **Actions** column,
+4. Close the modal by clicking on **Ok**, in the bottom left corner.
+
+{{ figure(image="tutorial/training/segmentation/inference_model_select.png", height=500, caption="Add your trained model to the process") }}
+
+The process is ready and you can launch it using the **Run process** button. Wait for its completion before moving to the next step.
+
+### Visualizing predictions
+
+To see the predictions of your model, browse back to the `test` folder in your corpus. There you can click on the first page displayed.
+
+{{ figure(image="tutorial/training/segmentation/inference_full_page.png", height=500, caption="A page of the test set with annotations and predictions") }}
+
+On all pages of the `test` set, you can see both the annotations and the predictions. To tell ground-truth annotations apart from predictions, look at the details of each element.
+
+{{ figure(image="tutorial/training/segmentation/inference_gt_line.png", height=100, caption="A text line annotated by a human") }}
+
+{{ figure(image="tutorial/training/segmentation/inference_pred_line.png", height=100, caption="A text line predicted by the model") }}
+
+On the elements annotated by humans, **Callico** is mentioned. On the predicted elements, **YOLO** is mentioned. The confidence score of the YOLO prediction is also displayed.
+
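The comparison in this section is visual. If you want a number to go with it, one common measure is the intersection over union (IoU) between a ground-truth region and a predicted one. The sketch below computes it for axis-aligned bounding boxes only; it is an illustrative helper, not something provided by Arkindex or the YOLO worker, and the box format `(x_min, y_min, x_max, y_max)` is an assumption.

```python
Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_iou(first: Box, second: Box) -> float:
    """Intersection over union of two axis-aligned boxes, between 0 and 1."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    overlap_width = max(0.0, min(first[2], second[2]) - max(first[0], second[0]))
    overlap_height = max(0.0, min(first[3], second[3]) - max(first[1], second[1]))
    intersection = overlap_width * overlap_height

    first_area = (first[2] - first[0]) * (first[3] - first[1])
    second_area = (second[2] - second[0]) * (second[3] - second[1])
    union = first_area + second_area - intersection
    return intersection / union if union > 0 else 0.0
```

A value close to 1 means the prediction almost coincides with the annotation, while 0 means the two boxes do not overlap at all.
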
+## Next step
+
+Now that you have a segmentation model, you can [generate ground truth annotations to train a transcription model](@/tutorial/transcription-ground-truth.md).
+
+[1]: <https://en.wikipedia.org/wiki/Leakage_(machine_learning)>
+
+---
+
+[^epoch]: An epoch corresponds to one complete pass of the training dataset through the algorithm. The performance of the model generally increases with the number of epochs. However, the model will eventually stop improving, so specifying a very high number might waste time.
+
+[^training]: Model training in machine learning is the process of feeding an ML algorithm with data to help identify and learn good values for all of its parameters.
--- a/content/tutorial/training/segmentation/complete_dataset.png
+++ b/content/tutorial/training/segmentation/complete_dataset.png
--- a/content/tutorial/training/segmentation/confirm.png
+++ b/content/tutorial/training/segmentation/confirm.png
--- a/content/tutorial/training/segmentation/create.png
+++ b/content/tutorial/training/segmentation/create.png
--- a/content/tutorial/training/segmentation/create_model.png
+++ b/content/tutorial/training/segmentation/create_model.png
--- a/content/tutorial/training/segmentation/extract_process.png
+++ b/content/tutorial/training/segmentation/extract_process.png
--- a/content/tutorial/training/segmentation/inference_create_process.png
+++ b/content/tutorial/training/segmentation/inference_create_process.png
--- a/content/tutorial/training/segmentation/inference_filter_elements.png
+++ b/content/tutorial/training/segmentation/inference_filter_elements.png
--- a/content/tutorial/training/segmentation/inference_full_page.png
+++ b/content/tutorial/training/segmentation/inference_full_page.png
--- a/content/tutorial/training/segmentation/inference_gt_line.png
+++ b/content/tutorial/training/segmentation/inference_gt_line.png
--- a/content/tutorial/training/segmentation/inference_model_select.png
+++ b/content/tutorial/training/segmentation/inference_model_select.png
--- a/content/tutorial/training/segmentation/inference_pred_line.png
+++ b/content/tutorial/training/segmentation/inference_pred_line.png
--- a/content/tutorial/training/segmentation/inference_test_set.png
+++ b/content/tutorial/training/segmentation/inference_test_set.png
--- a/content/tutorial/training/segmentation/inference_worker_add.png
+++ b/content/tutorial/training/segmentation/inference_worker_add.png
--- a/content/tutorial/training/segmentation/model_navbar.png
+++ b/content/tutorial/training/segmentation/model_navbar.png
--- a/content/tutorial/training/segmentation/process_running.png
+++ b/content/tutorial/training/segmentation/process_running.png
--- a/content/tutorial/training/segmentation/select.png
+++ b/content/tutorial/training/segmentation/select.png