Replace PDF usage through IIIF with image upload

Refs https://redmine.teklia.com/issues/5125

This only concerns PDFs that are contained in ZIP archives.

The PDF import feature for ZIP archives does not work on large PDFs, because it relies on Cantaloupe being able to download and parse the PDF in less than 30s (the server timeout).

Instead, we need to process each PDF found in a ZIP archive so that it matches the behaviour of direct PDF upload:

  1. extract each image (we already have extract_pdf_images)
  2. upload it to the S3 bucket so that it is exposed as a local image
  3. create the IIIF image on the backend (updating the URL to point directly to the image instead of the PDF).
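
A rough sketch of how the three steps above could be chained for each PDF in an archive. Everything here is an assumption made for discussion: the function name `import_pdfs_from_zip` is hypothetical, the real signature of `extract_pdf_images` may differ from the `(pdf_path, output_dir)` form used below, and `upload_image` / `create_iiif_image` are placeholders for the existing S3 upload and backend calls. The helpers are passed in as callables so that the sketch does not depend on the actual code.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Callable, Iterable, List


def import_pdfs_from_zip(
    archive_path: Path,
    # Assumed signature: (pdf_path, output_dir) -> paths of the extracted page images
    extract_pdf_images: Callable[[Path, Path], Iterable[Path]],
    # Placeholder: uploads one image to the S3 bucket and returns its URL
    upload_image: Callable[[Path], str],
    # Placeholder: creates the IIIF image on the backend from the image URL
    create_iiif_image: Callable[[str], None],
) -> List[str]:
    """Process every PDF in a ZIP archive page by page and return the image URLs."""
    urls: List[str] = []
    with tempfile.TemporaryDirectory() as tmp, zipfile.ZipFile(archive_path) as archive:
        workdir = Path(tmp)
        for index, info in enumerate(archive.infolist()):
            # Only PDF members are handled here; other files are skipped
            if info.is_dir() or not info.filename.lower().endswith(".pdf"):
                continue

            # Separate directories per member, so pages of different PDFs cannot collide
            pdf_dir = workdir / f"{index}-pdf"
            images_dir = workdir / f"{index}-images"
            pdf_dir.mkdir()
            images_dir.mkdir()
            pdf_path = Path(archive.extract(info, path=pdf_dir))

            # 1. extract each image from the PDF
            for image_path in extract_pdf_images(pdf_path, images_dir):
                # 2. upload it to the S3 bucket
                url = upload_image(image_path)
                # 3. create the IIIF image, pointing at the image rather than the PDF
                create_iiif_image(url)
                urls.append(url)
    return urls
```

With this shape, Cantaloupe never has to download or parse the PDF itself, so the 30s timeout no longer depends on the PDF size. Uploads happen while the temporary directory still exists, and the extracted files are cleaned up once the archive has been processed.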