This script downloads pages with transcriptions from Arkindex.
It also generates reproducible train, val and test splits.
Documentation is available at https://teklia.gitlab.io/atr/data-generator/.
`ARKINDEX_API_TOKEN` and `ARKINDEX_API_URL` environment variables must be defined.
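To confirm that both variables are set before launching the script, a small shell check like the one below may help. This is only a sketch: `check_arkindex_env` is a hypothetical helper name, not part of this project or of Arkindex.

```sh
# Hypothetical helper (not part of this project): report which of the
# required Arkindex variables are unset or empty.
check_arkindex_env() {
    missing=0
    for name in ARKINDEX_API_URL ARKINDEX_API_TOKEN; do
        # Indirect lookup through eval keeps this portable to POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing environment variable: $name" >&2
            missing=1
        fi
    done
    # Returns 0 when both variables are set, 1 otherwise.
    return "$missing"
}
```

For example, `check_arkindex_env && echo "Arkindex environment is ready"` prints the message only when both variables are defined.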
You can create an alias by adding this line to your `~/.bashrc`:
```sh
alias set_demo='export ARKINDEX_API_URL=https://demo.arkindex.org/;export ARKINDEX_API_TOKEN=my_api_token'
```
Then run: