New endpoint to create a training process
We need a new endpoint to create a process that will run a training worker through ponos.
The endpoint is named CreateTrainingProcess, is available through POST, and takes the following input payload:
- name: required, for the process, so we are aware of what is happening
- corpus_id: required
- train_folder_id: required, an element in the corpus specified above that is a folder
- test_folder_id: optional, an element in the corpus specified above that is a folder
- worker_version_id: required, the worker version that will be used to train a new model
- model_version_id: nullable, an optional model to train on top of
- worker_configuration_id: nullable, an optional worker configuration to apply to the worker being started
- use_gpu: a boolean, defaults to False
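
As a rough illustration of the payload above, here is a minimal sketch in plain Python. The `TrainingProcessPayload` class and the `parse_payload` helper are hypothetical names, not part of the existing code base, and the real endpoint would presumably rely on a serializer instead. The sketch only checks required fields and types; it does not check that the folders exist in the corpus or that they are folders.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass(frozen=True)
class TrainingProcessPayload:
    """Hypothetical container mirroring the fields listed in this issue."""
    name: str
    corpus_id: UUID
    train_folder_id: UUID
    worker_version_id: UUID
    test_folder_id: Optional[UUID] = None
    model_version_id: Optional[UUID] = None
    worker_configuration_id: Optional[UUID] = None
    use_gpu: bool = False


REQUIRED = ("name", "corpus_id", "train_folder_id", "worker_version_id")
UUID_FIELDS = (
    "corpus_id",
    "train_folder_id",
    "test_folder_id",
    "worker_version_id",
    "model_version_id",
    "worker_configuration_id",
)


def parse_payload(data: dict) -> TrainingProcessPayload:
    """Validate a raw dict and turn it into a typed payload."""
    missing = [key for key in REQUIRED if data.get(key) in (None, "")]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")

    use_gpu = data.get("use_gpu", False)
    if not isinstance(use_gpu, bool):
        raise ValueError("use_gpu must be a boolean")

    kwargs = {"name": str(data["name"]), "use_gpu": use_gpu}
    for key in UUID_FIELDS:
        value = data.get(key)
        # UUID() raises ValueError on malformed identifiers.
        kwargs[key] = UUID(str(value)) if value is not None else None
    return TrainingProcessPayload(**kwargs)
```

Omitting `test_folder_id`, `model_version_id`, `worker_configuration_id` or `use_gpu` falls back to `None` or `False`, matching the optional and nullable fields above.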
So you'll need to add a few fields on DataImport to store new data:
- train_folder_id: nullable FK
- test_folder_id: nullable FK
Also, the DataImport.mode will need to have a new value Training. This requires an MR on the frontend and tasks to update the enums.
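
The model changes could look roughly like the sketch below. It is a plain-Python stand-in rather than the real Django model: the `DataImportFields` class and the serialized value `"training"` are assumptions made for illustration, and the existing mode values are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
from uuid import UUID


class DataImportMode(Enum):
    # Existing modes are omitted here; only the new value is shown.
    Training = "training"  # assumed serialized value


@dataclass
class DataImportFields:
    """Stand-in for the new columns on DataImport; not the real model."""
    mode: DataImportMode
    train_folder_id: Optional[UUID] = None  # nullable FK to a folder element
    test_folder_id: Optional[UUID] = None   # nullable FK to a folder element
```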
The process being created has several constants:
- mode is set to Training
- only one chunk is created
- a ponos workflow is created with a single task:
  - using the worker version mentioned
  - and the optional worker configuration
  - following the GPU usage
  - the default docker command is used (provided by the worker version image)
  - the dataimport id must be exposed to the workflow as an env variable (this allows the task to retrieve the details of the dataimport)
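
The constants above can be sketched as a small builder. This is only an illustration under stated assumptions: the dictionary layout, the task name `training`, the `requires_gpu` key and the `DATAIMPORT_ID` variable name are all hypothetical and do not reflect the actual ponos workflow format.

```python
from typing import Optional
from uuid import UUID


def build_training_workflow(
    dataimport_id: UUID,
    worker_version_id: UUID,
    worker_configuration_id: Optional[UUID] = None,
    use_gpu: bool = False,
) -> dict:
    """Build a single-task workflow description (hypothetical layout)."""
    task = {
        "worker_version_id": str(worker_version_id),
        "requires_gpu": use_gpu,
        # No "command" key: the default command of the worker version image is used.
        # The variable name below is an assumption; the issue does not name it.
        "env": {"DATAIMPORT_ID": str(dataimport_id)},
    }
    if worker_configuration_id is not None:
        task["worker_configuration_id"] = str(worker_configuration_id)

    # One chunk, hence one task in the workflow.
    return {"tasks": {"training": task}}
```

Passing the dataimport id through the environment keeps the task definition independent of the folders and model: the worker can look them up from the dataimport once it starts.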
The new endpoint simply returns the ID of the created dataimport.
The RetrieveDataImport endpoint must be extended to export the new fields from DataImport.
Edited by Erwan Rouchet