Commit 961566b1 authored by Yoann Schneider

Merge branch 'ext-tar-zst' into 'main'

Update dataset archive extension to `.tar.zst`

Closes #9

See merge request workers/generic-training-dataset!13
parents b017a586 e6c732ea
1 merge request: !13 Update dataset archive extension to `.tar.zst`
Pipeline #141490 passed
@@ -355,11 +355,11 @@ class DatasetExtractor(DatasetWorker):
 casted_elements = list(map(_format_element, elements))
 self.process_split(split_name, casted_elements)
-# TAR + ZSTD the cache and the images folder, and store as task artifact
-zstd_archive_path: Path = self.work_dir / f"{dataset.id}.zstd"
-logger.info(f"Compressing the images to {zstd_archive_path}")
+# TAR + ZST the cache and the images folder, and store as task artifact
+zst_archive_path: Path = self.work_dir / f"{dataset.id}.tar.zst"
+logger.info(f"Compressing the images to {zst_archive_path}")
 create_tar_zst_archive(
-    source=self.data_folder_path, destination=zstd_archive_path
+    source=self.data_folder_path, destination=zst_archive_path
 )
 self.data_folder.cleanup()
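
For reviewers, a minimal sketch of what the rename changes from the point of view of a consumer that inspects the file name. It only uses `pathlib`; the `archive_path` helper, the `/tmp/work` directory, and the dataset ID are hypothetical stand-ins for `self.work_dir` and `dataset.id`, and are not part of this commit.

```python
from pathlib import Path


def archive_path(work_dir: Path, dataset_id: str) -> Path:
    """Build the artifact path with the new `.tar.zst` extension.

    Hypothetical helper mirroring the f-string used in the diff.
    """
    return work_dir / f"{dataset_id}.tar.zst"


# Assumed values, for illustration only.
work_dir = Path("/tmp/work")
dataset_id = "11111111-2222-3333-4444-555555555555"

old_path = work_dir / f"{dataset_id}.zstd"
new_path = archive_path(work_dir, dataset_id)

# The old name hides the tar layer; the new one exposes both layers.
print(old_path.suffixes)  # ['.zstd']
print(new_path.suffixes)  # ['.tar', '.zst']
```

With the new name, the extension states that the artifact is a tar archive compressed with Zstandard, rather than a bare compressed stream.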