Make `teklia-dan dataset extract` compatible with latest Arkindex exports

I got the following error when extracting from a new export (2024-08-06):

(dan) starride@straton:~/data$ teklia-dan dataset extract hugin-munin-norhand-v3-20240806-204721.sqlite --output norhand_fully_automatic --allow-empty --dataset-id 2cb2a75d-3a9a-4443-b4d5-6d5c30606974 --element-type text_line 
Extracting data from (2cb2a75d-3a9a-4443-b4d5-6d5c30606974) for split (train): 0it [00:00, ?it/s]
Extracting data from (2cb2a75d-3a9a-4443-b4d5-6d5c30606974) for split (test):   0%|                | 0/1599 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 3252, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: no such column: t1.worker_version_id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/users/starride/miniconda3/envs/dan/bin/teklia-dan", line 8, in <module>
    sys.exit(main())
  File "/home/users/starride/git_repos/dan/dan/cli.py", line 31, in main
    status = args.pop("func")(**args)
  File "/home/users/starride/git_repos/dan/dan/datasets/extract/arkindex.py", line 261, in run
    ).run()
  File "/home/users/starride/git_repos/dan/dan/datasets/extract/arkindex.py", line 213, in run
    self.process_parent(
  File "/home/users/starride/git_repos/dan/dan/datasets/extract/arkindex.py", line 157, in process_parent
    parent = dataset_parent.element
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 4657, in __get__
    return self.get_rel_instance(instance)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 4648, in get_rel_instance
    obj = self.rel_model.get(self.field.rel_field == value)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 6712, in get
    return sq.get()
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 7160, in get
    return clone.execute(database)[0]
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 1972, in inner
    return method(self, database, *args, **kwargs)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 2043, in execute
    return self._execute(database)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 2216, in _execute
    cursor = database.execute(self)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 3260, in execute
    return self.execute_sql(sql, params)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 3250, in execute_sql
    with __exception_wrapper__:
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 3020, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 192, in reraise
    raise value.with_traceback(tb)
  File "/home/users/starride/miniconda3/envs/dan/lib/python3.10/site-packages/peewee.py", line 3252, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: no such column: t1.worker_version_id
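
To confirm that the export schema changed, rather than the file being corrupted, the columns actually present in the SQLite file can be listed directly. This is a minimal sketch using only the standard-library `sqlite3` module. The helper name `table_columns` is hypothetical, and the table name `element` is an assumption based on the traceback, where the failing query comes from resolving `dataset_parent.element`.

```python
import sqlite3


def table_columns(db_path: str, table: str) -> list[str]:
    """Return the column names of `table` in the SQLite file at `db_path`.

    Hypothetical helper for debugging; it is not part of teklia-dan.
    An unknown table name yields an empty list.
    """
    con = sqlite3.connect(db_path)
    try:
        # pragma_table_info is SQLite's table-valued form of PRAGMA table_info
        # and accepts a bound parameter for the table name.
        rows = con.execute(
            "SELECT name FROM pragma_table_info(?)", (table,)
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

For example, `"worker_version_id" in table_columns("hugin-munin-norhand-v3-20240806-204721.sqlite", "element")` should evaluate to `False` on the export above if the column was indeed dropped from that table.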