New endpoint to create a training process
We need a new endpoint to create a process that will run a training worker through ponos.
The endpoint is named CreateTrainingProcess, is available through POST, and has the following input payload:
- name: required, the name of the process, so we are aware of what is happening
- corpus_id: required
- train_folder_id: required, an element in the specified corpus that is a folder
- test_folder_id: optional, an element in the specified corpus that is a folder
- worker_version_id: required, the worker version that will be used to train a new model
- model_version_id: nullable, a potential model version to train on top of
- worker_configuration_id: nullable, a potential worker configuration to apply to the worker being started
- use_gpu: boolean, defaults to False
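As an illustration only, the payload rules above could be checked along these lines. This is a minimal standalone sketch, not the actual serializer: the `TrainingProcessPayload` class and the `parse_payload` helper are hypothetical names, and it assumes all IDs are UUIDs. It does not check that the folders belong to the corpus or are of the folder type, since that requires database access.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass(frozen=True)
class TrainingProcessPayload:
    """Hypothetical container mirroring the input payload fields."""
    name: str
    corpus_id: UUID
    train_folder_id: UUID
    worker_version_id: UUID
    test_folder_id: Optional[UUID] = None
    model_version_id: Optional[UUID] = None
    worker_configuration_id: Optional[UUID] = None
    use_gpu: bool = False


REQUIRED = ("name", "corpus_id", "train_folder_id", "worker_version_id")
OPTIONAL_IDS = ("test_folder_id", "model_version_id", "worker_configuration_id")


def parse_payload(data: dict) -> TrainingProcessPayload:
    """Validate presence and types only (assumption: IDs are UUIDs)."""
    missing = [key for key in REQUIRED if data.get(key) in (None, "")]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")

    use_gpu = data.get("use_gpu", False)
    if not isinstance(use_gpu, bool):
        raise ValueError("use_gpu must be a boolean")

    kwargs = {"name": str(data["name"]), "use_gpu": use_gpu}
    # Required identifiers (everything in REQUIRED except the name)
    for key in REQUIRED[1:]:
        kwargs[key] = UUID(str(data[key]))
    # Optional or nullable identifiers stay None when absent
    for key in OPTIONAL_IDS:
        value = data.get(key)
        kwargs[key] = UUID(str(value)) if value is not None else None
    return TrainingProcessPayload(**kwargs)
```

In the real endpoint, the same rules would presumably live in the API serializer, alongside the corpus and folder checks.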
So you'll need to add a few fields on DataImport to store new data:
- nullable FK train_folder_id
- nullable FK test_folder_id
Also, DataImport.mode will need to have a new value, Training. This requires an MR on the frontend and tasks to update the enums.
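For reference, the enum change might look like the sketch below. The class name `DataImportMode` and the string value `"training"` are assumptions for illustration; the existing modes are deliberately left out rather than guessed.

```python
from enum import Enum


class DataImportMode(Enum):
    # Existing modes are omitted from this sketch; only the new value is shown.
    # The string value "training" is an assumption.
    Training = "training"
```

Whatever value is chosen has to match across the backend, the frontend and the tasks, which is why the enums need updating in all of them.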
The process being created has several constants:
- mode is set to Training
- only one chunk is created
- a ponos workflow is created with a single task:
  - using the worker version mentioned above
  - and the optional worker configuration
  - following the requested GPU usage
  - the default docker command is used (provided by the worker version image)
  - the dataimport ID must be exposed to the workflow as an env variable (this allows the task to retrieve the details of the dataimport)
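The constants above can be summarised in a small sketch. Everything here is hypothetical: the function names, the dictionary keys and the `DATAIMPORT_ID` variable name are placeholders chosen for illustration, not the actual ponos workflow format.

```python
from typing import Optional


def build_training_task(
    dataimport_id: str,
    worker_version_id: str,
    worker_configuration_id: Optional[str] = None,
    use_gpu: bool = False,
) -> dict:
    """Describe the single task of the training workflow (hypothetical shape)."""
    task = {
        "worker_version_id": worker_version_id,
        "requires_gpu": use_gpu,
        # No "command" key: the default docker command of the worker
        # version image is used.
        # The variable name below is an assumption.
        "env": {"DATAIMPORT_ID": dataimport_id},
    }
    if worker_configuration_id is not None:
        task["worker_configuration_id"] = worker_configuration_id
    return task


def build_training_process(dataimport_id: str, task: dict) -> dict:
    """Gather the constants of a training process (hypothetical shape)."""
    return {
        "id": dataimport_id,
        "mode": "training",
        "chunks": 1,  # only one chunk is created
        "workflow": {"tasks": [task]},  # a single task
    }
```

A worker started this way could then read the variable from its environment and query the RetrieveDataImport endpoint for the train and test folders.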
The new endpoint simply returns the ID of the created dataimport.
The RetrieveDataImport endpoint must be extended to export the new fields from DataImport.
Edited by Erwan Rouchet