Store absolute path of images

Manon Blanco requested to merge format-absolute-path into main

Closes #78 (closed)

I think that checking whether the path is absolute is the wrong approach; it is up to the script to generate the correct path.

I got these results by running teklia-dan dataset format --dataset ./data/ --image-format jpg (so with a relative --dataset path):

Before:

{
    "test": {
        "data/images/test/double_page_100ca67f-2d0d-4746-bc69-881cce07860d.jpg": {
            "text": "\u24ddZarnitzki \u24dfRobert \u24d323.3.13 \u24e1139872\n\u24ddLevillain \u24dfRobert \u24d326.10.12 \u24e1139872\n\u24ddCapelle \u24dfErnest \u24d318.4.07 \u24e1139872\n\u24ddClaren \u24dfJean \u24d318.2.12 \u24e1139872\n\u24ddBridet \u24dfCharles \u24d329.10.07 \u24e1139872\n\u24ddFeissel \u24dfRen\u00e9 \u24d327.1.09 \u24e1139872\n\u24ddNetter \u24dfSimon \u24d311.3.01 \u24e1139872\n\u24ddReder \u24dfChaim \u24d321.3.19 \u24e1139872"
        }
    },
    "train": {
        "data/images/train/double_page_107a04df-a547-47bb-9a81-6ca74887abf3.jpg": {
            "text": "\u24ddCynk \u24dfAntoine \u24d327.6.06 \u24e143467\n\u24ddLonc \u24dfBronislaw \u24d319.8.19 \u24e143467\n\u24ddCieranski \u24dfFranciszek \u24d36.10.12 \u24e143467\n\u24ddMariasz \u24dfJulien \u24d32.3.19 \u24e143467\n\u24ddHinkulak \u24dfBronislaw \u24d313.11.17 \u24e143467\n\u24ddPiechota \u24dfFran\u00e7ois \u24d329.9.08 \u24e143467\n\u24ddRyczho \u24dfNicolaj \u24d32.11.10 \u24e143467\n\u24ddBuczkovski \u24dfDominik \u24d31.7.12 \u24e143467"
        }
    },
    "val": {
        "data/images/val/double_page_0103d26f-93f4-4ba0-a0ca-51bb8297e37d.jpg": {
            "text": "\u24ddPont \u24dfAndr\u00e9 \u24d313.6.07 \u24e185105\n\u24ddMiguel \u24dfMartin \u24d325.11.14 \u24e185105\n\u24ddGrousseau \u24dfAuguste \u24d318.4.11 \u24e185105\n\u24ddMolinier \u24dfJoseph \u24d35.11.07 \u24e185105\n\u24ddBottero \u24dfPierre \u24d311.9.15 \u24e185105\n\u24ddPendaries \u24dfMarius \u24d315.4.15 \u24e185105\n\u24ddMarsan \u24dfMarcel \u24d317.9.06 \u24e185105\n\u24ddLescure \u24dfRen\u00e9 \u24d32.1.14 \u24e185105"
        }
    }
}

After:

{
    "test": {
        "/home/users/mblanco/test_prediction/data/images/test/double_page_100ca67f-2d0d-4746-bc69-881cce07860d.jpg": {
            "text": "\u24ddZarnitzki \u24dfRobert \u24d323.3.13 \u24e1139872\n\u24ddLevillain \u24dfRobert \u24d326.10.12 \u24e1139872\n\u24ddCapelle \u24dfErnest \u24d318.4.07 \u24e1139872\n\u24ddClaren \u24dfJean \u24d318.2.12 \u24e1139872\n\u24ddBridet \u24dfCharles \u24d329.10.07 \u24e1139872\n\u24ddFeissel \u24dfRen\u00e9 \u24d327.1.09 \u24e1139872\n\u24ddNetter \u24dfSimon \u24d311.3.01 \u24e1139872\n\u24ddReder \u24dfChaim \u24d321.3.19 \u24e1139872"
        }
    },
    "train": {
        "/home/users/mblanco/test_prediction/data/images/train/double_page_107a04df-a547-47bb-9a81-6ca74887abf3.jpg": {
            "text": "\u24ddCynk \u24dfAntoine \u24d327.6.06 \u24e143467\n\u24ddLonc \u24dfBronislaw \u24d319.8.19 \u24e143467\n\u24ddCieranski \u24dfFranciszek \u24d36.10.12 \u24e143467\n\u24ddMariasz \u24dfJulien \u24d32.3.19 \u24e143467\n\u24ddHinkulak \u24dfBronislaw \u24d313.11.17 \u24e143467\n\u24ddPiechota \u24dfFran\u00e7ois \u24d329.9.08 \u24e143467\n\u24ddRyczho \u24dfNicolaj \u24d32.11.10 \u24e143467\n\u24ddBuczkovski \u24dfDominik \u24d31.7.12 \u24e143467"
        }
    },
    "val": {
        "/home/users/mblanco/test_prediction/data/images/val/double_page_0103d26f-93f4-4ba0-a0ca-51bb8297e37d.jpg": {
            "text": "\u24ddPont \u24dfAndr\u00e9 \u24d313.6.07 \u24e185105\n\u24ddMiguel \u24dfMartin \u24d325.11.14 \u24e185105\n\u24ddGrousseau \u24dfAuguste \u24d318.4.11 \u24e185105\n\u24ddMolinier \u24dfJoseph \u24d35.11.07 \u24e185105\n\u24ddBottero \u24dfPierre \u24d311.9.15 \u24e185105\n\u24ddPendaries \u24dfMarius \u24d315.4.15 \u24e185105\n\u24ddMarsan \u24dfMarcel \u24d317.9.06 \u24e185105\n\u24ddLescure \u24dfRen\u00e9 \u24d32.1.14 \u24e185105"
        }
    }
}
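
For illustration only, the Before/After change above can be approximated with a small sketch that rewrites every image key of a labels mapping to an absolute path. The helper name `to_absolute_keys` and its `base_dir` parameter are hypothetical and not taken from the teklia-dan code base; this is not necessarily how the merge request implements it.

```python
from pathlib import Path


def to_absolute_keys(labels: dict, base_dir: Path) -> dict:
    """Return a copy of `labels` whose image keys are absolute paths.

    Hypothetical helper: `labels` is assumed to have the shape shown above,
    {split: {image_path: {"text": ...}}}. Relative keys are resolved against
    `base_dir`; keys that are already absolute are left pointing at the same
    file, because joining a base with an absolute path yields that path.
    """
    return {
        split: {
            # resolve() makes the path absolute and normalises "." and ".."
            str((base_dir / image_path).resolve()): entry
            for image_path, entry in images.items()
        }
        for split, images in labels.items()
    }


if __name__ == "__main__":
    sample = {"test": {"data/images/test/example.jpg": {"text": "example"}}}
    # Resolve against the current working directory, as in the command above.
    print(to_absolute_keys(sample, Path.cwd()))
```

Generating the absolute path once, when the dataset is formatted, means later steps do not need to check whether a stored path is relative.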
Edited by Manon Blanco
