Build training metric storage and API endpoints

Refs https://redmine.teklia.com/issues/2483

We are finally building the ML model training metric system 🎉

We'll need 2 more models:

  • training.Metric
    • UUID PK
    • charfield name
    • FK towards training.ModelVersion
    • enum mode:
      • serie (default)
      • point
    • unique together on model_version + name
  • training.MetricValue
    • UUID PK
    • FK towards training.Metric
    • float value
    • datetime created
    • positive integer field nullable step
    • unique together:
      • metric + created
      • metric + step (when step is non null)
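
The constraints listed above can be tried out in isolation. The sketch below is not the Django implementation: it uses the standard-library `sqlite3` module only to illustrate the intended uniqueness rules, in particular the partial unique index on metric + step. The table names, column names and the helpers `connect`, `add_metric` and `add_value` are hypothetical, and the `ModelVersion` table is left out.

```python
import sqlite3
import uuid

# Hypothetical schema mirroring the fields listed in this issue.
SCHEMA = """
CREATE TABLE metric (
    id TEXT PRIMARY KEY,                 -- UUID stored as text
    name TEXT NOT NULL,
    model_version_id TEXT NOT NULL,      -- would be a FK to training.ModelVersion
    mode TEXT NOT NULL DEFAULT 'serie' CHECK (mode IN ('serie', 'point')),
    UNIQUE (model_version_id, name)
);
CREATE TABLE metric_value (
    id TEXT PRIMARY KEY,
    metric_id TEXT NOT NULL REFERENCES metric (id),
    value REAL NOT NULL,
    created TEXT NOT NULL,               -- ISO 8601 datetime
    step INTEGER CHECK (step IS NULL OR step >= 0),
    UNIQUE (metric_id, created)
);
-- Partial unique index: only enforced for rows where step is set.
CREATE UNIQUE INDEX metric_value_unique_step
    ON metric_value (metric_id, step) WHERE step IS NOT NULL;
"""


def connect():
    """Open an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_metric(conn, model_version_id, name, mode="serie"):
    """Insert a metric and return its generated UUID."""
    metric_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO metric (id, name, model_version_id, mode) VALUES (?, ?, ?, ?)",
        (metric_id, name, model_version_id, mode),
    )
    return metric_id


def add_value(conn, metric_id, value, created, step=None):
    """Insert a metric value; raises sqlite3.IntegrityError on a duplicate."""
    value_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO metric_value (id, metric_id, value, created, step) "
        "VALUES (?, ?, ?, ?, ?)",
        (value_id, metric_id, value, created, step),
    )
    return value_id
```

With this schema, several values without a step can coexist for one metric as long as their `created` timestamps differ, while two values sharing the same non-null step are rejected. In Django, the same rule would presumably be expressed with a conditional `UniqueConstraint`.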

Please add an admin for Metric that lists each metric's name & model version, with a read-only inline for its metric values.

Finally, we'll build 2 API endpoints to create metrics:

  • CreateMetric that will allow a user to create a single metric + value. Its fields are:
    • name
    • value
    • model_version_id
    • mode (optional)
    • step (optional)
  • CreateMetrics that will allow a user to create multiple metrics + values at a given step/time. Its fields are:
    • model_version_id
    • step (optional)
    • metrics (list of dicts):
      • name
      • value
      • mode (optional)
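
To make the two payload shapes concrete, here is a minimal sketch of how both could be normalized into the same rows before saving. It is plain Python rather than the actual serializers: `MetricRow`, `parse_create_metric` and `parse_create_metrics` are hypothetical names, and rejecting duplicate metric names within a single CreateMetrics call is an assumption derived from the uniqueness constraints above, not something this issue states.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

MODES = ("serie", "point")


@dataclass(frozen=True)
class MetricRow:
    """One metric value to store (hypothetical intermediate structure)."""
    model_version_id: str
    name: str
    value: float
    mode: str
    step: Optional[int]


def _check_step(step: Any) -> Optional[int]:
    # step is optional; when given it must be a non-negative integer.
    if step is None:
        return None
    if isinstance(step, bool) or not isinstance(step, int) or step < 0:
        raise ValueError("step must be a non-negative integer or omitted")
    return step


def _row(model_version_id: str, step: Optional[int], item: Dict[str, Any]) -> MetricRow:
    # mode is optional and falls back to the default enum value.
    mode = item.get("mode", "serie")
    if mode not in MODES:
        raise ValueError("mode must be one of %s" % (MODES,))
    name = item["name"]
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    return MetricRow(model_version_id, name, float(item["value"]), mode, step)


def parse_create_metric(payload: Dict[str, Any]) -> MetricRow:
    """CreateMetric: a single metric + value."""
    step = _check_step(payload.get("step"))
    return _row(payload["model_version_id"], step, payload)


def parse_create_metrics(payload: Dict[str, Any]) -> List[MetricRow]:
    """CreateMetrics: several metrics sharing one model version and step."""
    step = _check_step(payload.get("step"))
    rows = [_row(payload["model_version_id"], step, m) for m in payload["metrics"]]
    names = [row.name for row in rows]
    if len(names) != len(set(names)):
        # Assumption: two values for one metric at the same step/time would
        # break the unique constraints, so reject them early.
        raise ValueError("duplicate metric names in a single request")
    return rows
```

For example, a CreateMetrics body such as `{"model_version_id": "<uuid>", "step": 3, "metrics": [{"name": "loss", "value": 0.25}]}` would yield one row with mode `serie` and step 3.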

Note: neither a worker version nor a worker run is needed on either endpoint, as we can infer the process from the one that is building the model.

Access rights should be the same as for creating a training process (either internal user, or contributor on the model).