Merge the Transkribus and file imports

  • Remove arkindex_tasks.export_transkribus
  • Move the Transkribus import code from arkindex_tasks.import_transkribus to arkindex_tasks.import_files.transkribus, ignoring the main()
  • Move arkindex_tasks.pagexml to arkindex_tasks.import_files.pagexml
  • Remove support for job paths from the TranskribusImporter
  • Have the TranskribusImporter return its generated elements instead of writing the elements.json file itself
  • Add code in arkindex_tasks.import_files to:
    • Detect that a file is a ZIP archive
    • Try to open the ZIP archive
    • If the ZIP archive contains a mets.xml, assume that it is a Transkribus archive and import it using the TranskribusImporter
      (Yes, we do not support a generic METS import, but mets.xml is the only file that should always be found in every Transkribus export)
    • If it does not contain mets.xml, die in mysterious circumstances (for now)
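
The detection steps above could be sketched roughly as follows. This is only an illustration, not the actual arkindex_tasks code: the function name `classify_upload` and its return values are hypothetical, and it assumes mets.xml may sit at any depth inside the archive. The real code would hand the archive to the TranskribusImporter instead of returning a label.

```python
import zipfile
from pathlib import PurePosixPath


def classify_upload(path):
    """Hypothetical helper: decide how an uploaded file should be imported.

    Returns "transkribus" for a ZIP archive containing a mets.xml,
    "unsupported" for any other ZIP archive, and "regular" for
    everything else.
    """
    # Detect that the file is a ZIP archive (checks the ZIP magic number).
    if not zipfile.is_zipfile(path):
        return "regular"

    # Try to open the archive; a truncated or corrupt file can still fail here.
    try:
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return "unsupported"

    # Assumption: mets.xml may be nested in a folder, so compare base names.
    if any(PurePosixPath(name).name == "mets.xml" for name in names):
        return "transkribus"

    # No mets.xml: not a Transkribus export, nothing else is supported yet.
    return "unsupported"
```

Returning a label rather than raising keeps the "no mets.xml" case explicit, so it can later be replaced with a proper error or another importer.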
Edited by Erwan Rouchet