Import the export!
We need to move some corpus data from preprod to prod for Ocapi (but some ML experts wanted that feature too).
We can now build a Django management command (named `load_export`) to import the export from one instance into another.
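As a rough illustration of the command's CLI, here is a minimal sketch of the argument handling, written against plain `argparse` so it runs without Django. The `add_arguments(parser)` function has the same shape as the `add_arguments` hook of a Django `BaseCommand`, so it could be moved into the real command. The `export_path` positional argument and the ISO date format used for the default corpus name are assumptions, not something this issue specifies.

```python
import argparse
from datetime import date


def default_corpus_name(today=None):
    """Build the fallback corpus name (date format is an assumption)."""
    today = today or date.today()
    return f"Data import {today.isoformat()}"


def add_arguments(parser):
    # Same signature as the add_arguments hook of a Django BaseCommand.
    # "export_path" is a hypothetical argument, used here for illustration.
    parser.add_argument("export_path", help="Path to the export to import")
    parser.add_argument("--name", default=None, help="Name of the corpus to create")


def parse(argv):
    """Parse CLI arguments and fill in the default corpus name if needed."""
    parser = argparse.ArgumentParser(prog="load_export")
    add_arguments(parser)
    args = parser.parse_args(argv)
    if args.name is None:
        # Computed at run time so the date reflects the day of the import
        args.name = default_corpus_name()
    return args
```

Computing the default lazily, rather than in `add_argument(default=...)`, keeps the date tied to the moment the import runs.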
The overall workflow will be:
- create a new corpus with a name provided as `--name` on the CLI (defaults to `Data import <date>`)
- create all element types (from table `elements`); the name is the slug with `.title()` applied
- create all image servers (from table `image`, using `url`). Some code from `ImageServerManager.form_url` should be reused, or simply executed (might not be performant enough)
- create all images, using the previously created image servers
- create all worker versions, respecting the hierarchy:
  - get or create a new repo with url `http://data.import`
  - get or create workers using the provided slug as identifier
  - get or create version using the provided ID
  - ⚠ we lack some data in the export to do that cleanly
- create all elements using the previously created images and types
- create all transcriptions
- create all metadata (no allowed here)
- create all ML classes from the distinct classification class names
- create all classifications
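Two of the steps above derive new objects from distinct values in the export: element types and ML classes. The sketch below shows one way to compute them, assuming the export can be read as a SQLite database. The `type` and `class_name` columns and the `classification` table name are hypothetical; the real export schema may differ.

```python
import sqlite3


def element_types(db):
    """Return (slug, display name) pairs, one per distinct element type."""
    # Assumed schema: table "elements" with a "type" column holding the slug
    rows = db.execute("SELECT DISTINCT type FROM elements ORDER BY type")
    return [(slug, slug.title()) for (slug,) in rows]


def ml_class_names(db):
    """Return the distinct classification class names."""
    # Assumed schema: table "classification" with a "class_name" column
    rows = db.execute(
        "SELECT DISTINCT class_name FROM classification ORDER BY class_name"
    )
    return [name for (name,) in rows]


def build_demo_export():
    """Create a tiny in-memory export, for illustration only."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE elements (id TEXT PRIMARY KEY, type TEXT);
        CREATE TABLE classification (
            id TEXT PRIMARY KEY, element_id TEXT, class_name TEXT
        );
        INSERT INTO elements VALUES
            ('e1', 'page'), ('e2', 'text_line'), ('e3', 'page');
        INSERT INTO classification VALUES
            ('c1', 'e1', 'cover'), ('c2', 'e3', 'cover'), ('c3', 'e2', 'handwritten');
        """
    )
    return db
```

Note that `str.title()` capitalizes after every non-letter character, so a slug such as `text_line` becomes `Text_Line`, underscore included.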
Please note on this issue (as comments) every piece of data that is missing and should be added to the export.
Do not import entities for now.
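For the worker version step of the workflow, the repo, worker, version hierarchy can be sketched with a get-or-create pattern. The version below uses an in-memory `Registry` class (hypothetical, standing in for the database) so it runs on its own; in the real command each method would presumably map to a Django `Model.objects.get_or_create(...)` call. The row keys `slug` and `id` are assumptions about what the export provides.

```python
REPO_URL = "http://data.import"


class Registry:
    """In-memory stand-in for the database (illustration only)."""

    def __init__(self):
        self.repos = {}     # url -> repo
        self.workers = {}   # (repo url, slug) -> worker
        self.versions = {}  # version ID -> version

    def get_or_create_repo(self, url):
        return self.repos.setdefault(url, {"url": url})

    def get_or_create_worker(self, repo, slug):
        key = (repo["url"], slug)
        return self.workers.setdefault(key, {"repo": repo, "slug": slug})

    def get_or_create_version(self, worker, version_id):
        return self.versions.setdefault(
            version_id, {"id": version_id, "worker": worker}
        )


def import_worker_versions(registry, rows):
    """Create the hierarchy top-down: repo, then workers, then versions."""
    repo = registry.get_or_create_repo(REPO_URL)
    for row in rows:
        worker = registry.get_or_create_worker(repo, row["slug"])
        registry.get_or_create_version(worker, row["id"])
```

Because every level is get-or-create, running the import twice on the same export should not duplicate repos, workers, or versions.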