Support all kinds of regions
There is an old TODO for this, but the PageXML format supports even more region types.
We should support SeparatorRegion, TableRegion, GraphicRegion, MusicRegion and NoiseRegion, each parsed as `xxx_regions` (for example `separator_regions`).
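
A rough sketch of what the parsing could look like, using only the standard library. The function names (`parse_regions`, `attribute_name`) and the returned dict layout are made up for illustration and are not the project's existing API; the sketch also assumes each region carries an `id` attribute and a `Coords` child with a `points` attribute, as in recent PageXML schema versions.

```python
import xml.etree.ElementTree as ET

# Region element names taken from the list in this issue.
REGION_TAGS = (
    "SeparatorRegion",
    "TableRegion",
    "GraphicRegion",
    "MusicRegion",
    "NoiseRegion",
)


def local_name(tag):
    """Drop the '{namespace}' prefix so any PageXML schema version matches."""
    return tag.rsplit("}", 1)[-1]


def attribute_name(region_tag):
    """Hypothetical naming rule: 'TableRegion' -> 'table_regions'."""
    return region_tag[: -len("Region")].lower() + "_regions"


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def parse_regions(xml_text):
    """Collect every supported region, grouped under its xxx_regions key."""
    root = ET.fromstring(xml_text)
    regions = {attribute_name(tag): [] for tag in REGION_TAGS}
    for element in root.iter():
        name = local_name(element.tag)
        if name not in REGION_TAGS:
            continue
        # Assumption: the polygon lives in a direct <Coords points="..."/> child.
        coords = next((c for c in element if local_name(c.tag) == "Coords"), None)
        points = coords.get("points", "") if coords is not None else ""
        regions[attribute_name(name)].append(
            {"id": element.get("id"), "points": parse_points(points)}
        )
    return regions
```

Matching on the local tag name rather than a fixed namespace is a deliberate shortcut here, so the same code handles files written against different schema versions. Region types that do not occur in a file simply come back as empty lists.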