DataGenerator expects pages with text_lines children. It does not work when applied to folders that contain text_lines directly.
Expected hierarchy:
pages
    text_lines
For the Belfort project, our folders (train/val/test) directly contain text_lines elements, with no intermediate pages. DataGenerator fails to extract the dataset in this case.
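
To make the two layouts concrete, here is a minimal sketch of a traversal that accepts both of them. It is not DataGenerator's actual code: the dict-based element structure, the type names "page" and "text_line", and the helper iter_text_lines are all hypothetical and only illustrate the behaviour we would need.

```python
from typing import Iterator

# Hypothetical in-memory element: {"type": ..., "name": ..., "children": [...]}.
# This is an assumption for illustration, not DataGenerator's real data model.
Element = dict


def iter_text_lines(folder: Element) -> Iterator[Element]:
    """Yield text lines found directly under the folder or under its pages."""
    for child in folder.get("children", []):
        if child["type"] == "text_line":
            # Belfort-style layout: the folder contains text lines directly.
            yield child
        elif child["type"] == "page":
            # Layout currently expected: folder -> pages -> text lines.
            for grandchild in child.get("children", []):
                if grandchild["type"] == "text_line":
                    yield grandchild


# Layout currently expected by DataGenerator (pages with text line children).
paged_folder = {
    "type": "folder",
    "name": "train",
    "children": [
        {
            "type": "page",
            "name": "page_1",
            "children": [
                {"type": "text_line", "name": "line_1"},
                {"type": "text_line", "name": "line_2"},
            ],
        }
    ],
}

# Belfort-style layout (text lines directly inside the folder).
flat_folder = {
    "type": "folder",
    "name": "train",
    "children": [
        {"type": "text_line", "name": "line_1"},
        {"type": "text_line", "name": "line_2"},
    ],
}

paged_names = [line["name"] for line in iter_text_lines(paged_folder)]
flat_names = [line["name"] for line in iter_text_lines(flat_folder)]
```

With this kind of fallback, both layouts give the same list of lines, so the rest of the extraction would not need to know which one it was given.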