Clean up dataset archive after processing
We download the dataset's archive before the call to DatasetWorker.process_dataset. Usually it is downloaded to self.task_data_dir / "extra_files", where it is not an issue at all. However, when this folder doesn't exist, it is downloaded to the worker's work_dir folder, which is an issue. We should always clean up instead.
After processing is finished, we should remove the archive at self.find_extras_directory() / dataset.filepath.
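
A minimal sketch of the proposed cleanup, not the actual worker code. The helper name process_then_cleanup and its process callback are hypothetical stand-ins for the worker's call to DatasetWorker.process_dataset; in the worker, archive_path would be self.find_extras_directory() / dataset.filepath. Using try/finally is an assumption about the desired behavior: it removes the archive even when processing fails.

```python
from pathlib import Path
from typing import Callable


def process_then_cleanup(
    archive_path: Path, process: Callable[[Path], None]
) -> None:
    """Run `process` on the archive, then delete the archive.

    Hypothetical helper: `process` stands in for the dataset processing
    step; it is not part of the real worker API.
    """
    try:
        process(archive_path)
    finally:
        # Always remove the archive, whether processing succeeded or raised.
        # missing_ok=True (Python 3.8+) avoids an error if it is already gone.
        archive_path.unlink(missing_ok=True)
```

With this shape, the cleanup no longer depends on which folder the archive was downloaded to.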