Support new Dataset API
Bump to arkindex-base-worker==0.3.7rc7.
TODO:
- `process_dataset` should be renamed to `process_set` (with most of the code from `process_split` anyway)
- `extract_archive` should only do things if the dataset archive was not extracted before:
  - if `self.dataset_archive` is not set
  - or if it's set and the name of its parent folder does not end with the ID of the set's dataset
- We might want to store the `cached_dataset` as an attribute to avoid reloading it across sets
- `run` should not change that much
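
For the `extract_archive` item, the check could look roughly like the sketch below. This is only an illustration of the condition described above, not code from this branch: `needs_extraction` is a hypothetical helper, and it assumes `dataset_archive` is a `pathlib.Path` (or `None`) and that the dataset ID is available as a string.

```python
from pathlib import Path
from typing import Optional


def needs_extraction(dataset_archive: Optional[Path], dataset_id: str) -> bool:
    """Hypothetical helper: tell whether the archive of `dataset_id` must be extracted."""
    # No archive recorded yet, so nothing was extracted before.
    if dataset_archive is None:
        return True
    # An archive is recorded: re-extract only if its parent folder
    # does not end with the ID of the set's dataset.
    return not dataset_archive.parent.name.endswith(dataset_id)
```

`extract_archive` could then return early when this is `False`, so that several sets of the same dataset share a single extraction.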