Full METS import

Bastien Abadie requested to merge mets-full-import into master

This upgrades the existing arkindex upload mets command to fully process METS TOC files that reference ALTO files and images.

This code assumes:

  • images are already exposed through a IIIF server (that is, uploaded to a Ceph bucket that is exposed to our IIIF cluster),
  • the API user has admin access to the target corpus.
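The first assumption can be checked by hand before importing. The sketch below builds the info.json URL defined by the IIIF Image API for one image; the base URL, the helper name iiif_info_url and the image path are hypothetical, and percent-encoding the whole identifier follows the IIIF specification rather than anything confirmed about our cluster.

```python
from urllib.parse import quote


def iiif_info_url(base_url: str, iiif_prefix: str, image_path: str) -> str:
    """Build a IIIF Image API info.json URL for an image stored under a prefix.

    base_url is a hypothetical IIIF endpoint; iiif_prefix is the bucket path
    passed to --iiif-prefix; image_path is the image path relative to it.
    """
    # Join prefix and path with exactly one slash between them.
    identifier = iiif_prefix.rstrip("/") + "/" + image_path.lstrip("/")
    # The IIIF Image API requires the identifier to be URI-encoded,
    # so the slashes inside it become %2F.
    return f"{base_url.rstrip('/')}/{quote(identifier, safe='')}/info.json"


if __name__ == "__main__":
    # Hypothetical server and image name, for illustration only.
    print(iiif_info_url(
        "https://iiif.example.org/iiif/2",
        "bibliotheque-nationale-luxembourg/2023-04-03/mets_alto/4rbwjj/",
        "images/page-0001.jpg",
    ))
```

If fetching the printed URL returns a JSON document with the image width and height, the image is reachable through IIIF.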

This only supports ALTO as the source, plus images (no PDF, no metadata).
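To see whether a given METS file fits these limits, it can help to list the files it declares. This is a minimal sketch using only the Python standard library and the standard METS and XLink namespaces; it is not the code of this merge request, and the helper name list_mets_files is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard METS namespace and the namespaced xlink:href attribute name.
NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def list_mets_files(source):
    """Return {fileGrp USE value: [href, ...]} for a METS file.

    source is a path or a binary file object, as accepted by ET.parse.
    """
    root = ET.parse(source).getroot()
    groups = defaultdict(list)
    for grp in root.iterfind(".//mets:fileGrp", NS):
        use = grp.get("USE", "")
        # Only direct mets:file children, so nested groups are not counted twice.
        for flocat in grp.iterfind("mets:file/mets:FLocat", NS):
            href = flocat.get(XLINK_HREF)
            if href:
                groups[use].append(href)
    return dict(groups)


if __name__ == "__main__":
    import sys

    # Usage: python list_mets_files.py path/to/mets.xml
    for use, hrefs in list_mets_files(sys.argv[1]).items():
        print(f"{use or '(no USE)'}: {len(hrefs)} file(s)")
```

Groups that contain neither ALTO XML nor images (a PDF group, for example) fall outside what this import handles.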

I tested this with the nice modern BNL newspaper, available on Ceph (you can limit the download to the XML files; there is no need to download the images):

mc cp --recursive ceph/bibliotheque-nationale-luxembourg/2023-04-03/mets_alto/4rbwjj/ .

Then run:

arkindex -p preprod upload mets \
  ../mets_sample/1905940_newspaper_luxland_2007-12-07-mets.xml \
  bc6025ed-aea8-4834-a10a-f2ccbd20cb18 \
  --element-id=3f390299-7534-4943-9862-49dd1849487f \
  --iiif-prefix=bibliotheque-nationale-luxembourg/2023-04-03/mets_alto/4rbwjj/ \
  --dpi-x=300 \
  --dpi-y=300

Grab a coffee while this publishes...

Edited by Bastien Abadie