Necessary changes for pagexml parsing
This MR changes the pagexml and xml scripts so that they work with different versions of the PAGE XML format.
The code now accepts any pagexml file with a valid namespace (one that starts with `http://schema.primaresearch.org/PAGE/gts/pagecontent/`) and reads a region's coordinates whether they are given in a `points` attribute or in `Point` subelements.
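
For reviewers, here is a minimal sketch of the two behaviours described above, written with the standard library's `xml.etree.ElementTree`. It is not the code from this MR: the function names `page_namespace` and `region_coords` are hypothetical, and the sketch assumes the usual PAGE layout in which each region has a `Coords` child that carries either a `points="x,y x,y ..."` attribute or `Point` children with integer `x` and `y` attributes.

```python
import xml.etree.ElementTree as ET

# Prefix shared by all PAGE content namespaces (taken from the MR description).
PAGE_NS_PREFIX = "http://schema.primaresearch.org/PAGE/gts/pagecontent/"


def page_namespace(root):
    """Return the root element's namespace URI, or raise if it is not a PAGE namespace.

    Hypothetical helper: ElementTree stores qualified tags as "{uri}local".
    """
    if not root.tag.startswith("{"):
        raise ValueError("root element has no namespace")
    uri = root.tag[1:].split("}", 1)[0]
    if not uri.startswith(PAGE_NS_PREFIX):
        raise ValueError(f"not a PAGE namespace: {uri}")
    return uri


def region_coords(region, ns):
    """Return a region's outline as a list of (x, y) integer tuples.

    Hypothetical helper: handles both the `points` attribute form and the
    `Point` subelement form. Assumes integer coordinates.
    """
    coords = region.find(f"{{{ns}}}Coords")
    if coords is None:
        return []
    points = coords.get("points")
    if points:
        # Attribute form: "x1,y1 x2,y2 ..."
        return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]
    # Subelement form: <Point x="..." y="..."/>
    return [(int(p.get("x")), int(p.get("y"))) for p in coords.findall(f"{{{ns}}}Point")]
```

Resolving the namespace once from the root element and reusing it for every lookup keeps the rest of the parsing independent of the schema version in the file.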