Use ImageNet mean and std values
Closes #111 (closed) #114 (closed) #102 (closed)
Needs !171 (merged)
In this MR:
- We use ImageNet's normalization parameters, which give results similar to those calculated on the training set. This saves time before training, as we no longer iterate over all the training data to calculate these parameters, while maintaining similar performance (#111 (closed))
- Thanks to this modification, we can use torchvision's normalization function. It takes a tensor of size `CxHxW` as input, so we load the data directly using the `read_image` function from torchvision (#102 (closed))
- The pre-processing transformations are now done on tensors (and not on PIL images), so the code has been updated
- The data pre-processing is done directly after loading images, to reduce memory requirements with the `load_in_memory` parameter
- Update of the documentation and tests
Edited by Mélodie Boillet
Activity
changed milestone to %DAN-P4: Improve data loading and preprocessing
added P2 label
added 10 commits
- 9bbdba9e - Remove grayscale from preprocessing and add it to data augmentation pipeline
- 0f3e6a87 - Reimplement MaxResize to resize only big images
- 56d97810 - Remove deepcopy when I can
- 3ac2f11b - Download images in subresolution using IIIF url
- d34ca68e - add space instead of \n in entities
- 941ae466 - Input prediction folder
- 43cca91a - Use a random transcription of an element when more than one found
- 330b435c - Fix entity splitting
- daf7c86f - Use ImageNet normalization values
- 2ddb4e8b - Use tensors in Resize transformations
added 11 commits
- ca471d3c - 1 earlier commit
- ee9fcfdc - Remove deepcopy when I can
- 97231ac8 - Download images in subresolution using IIIF url
- 2e06f94b - add space instead of \n in entities
- 8bb1cd5f - Input prediction folder
- 617c4069 - Use a random transcription of an element when more than one found
- 67fc0bd5 - Fix entity splitting
- be088836 - Use ImageNet normalization values
- 84469a78 - Use tensors in Resize transformations
- f9fdc495 - Normalize test images
- 6a729093 - Fix prediction tests
Here are the comparison results. On xenarque:
- GPU 0: NVIDIA GeForce RTX 3080 Ti, major=8, minor=6, total_memory=12045MB, multi_processor_count=80
- POPP single page (128 / 16 / 16)
- 50 epochs
- batch = 1 / valid batch = 2
- max char = 1000
- DAN POPP Line fine-tuning on single pages

| Param | Original | Albumentations | Albumentations v2 + default mean/std |
| --- | --- | --- | --- |
| items/s - train | ~ 7 | ~ 6.5 | ~ 6.9 |
| s/items - valid | ~ 4.7 | ~ 4.6 | ~ 4.4 |
| best epoch | 40 | 35 | 45 |
| best valid cer | 0.2973 | 0.2876 | 0.2812 |
| test cer | 0.3052 | 0.3162 | 0.3131 |

Results are similar on the validation set, so we can use these ImageNet normalization params.
Prediction tests are now fixed. I'm waiting for !171 (merged) to be merged before updating:
- training tests
- documentation
- DAN worker (Remove this line https://gitlab.com/teklia/workers/dan/-/blob/main/worker_dan/worker.py#L411)
added 32 commits
- 6a729093...829ad8f7 - 6 commits from branch main
- 829ad8f7...16043a04 - 16 earlier commits
- 0df80aa4 - Fix typos and remove useless code
- fff02535 - Update tests with new transforms/resize
- 4e40889d - Preprocess with torchvision transforms
- a1e25549 - Use imgaug for data augmentation
- 69a7ebb9 - Simlify PIL/numpy conversion
- 6080e98e - Reimplement MaxResize to resize only big images
- 8adfe041 - Use ImageNet normalization values
- 6be2654c - Use tensors in Resize transformations
- ae1673af - Normalize test images
- fe9a0876 - Fix prediction tests
added 11 commits
- 78e6f601 - 1 commit from branch main
- 3a252197 - Preprocess with torchvision transforms
- b685b84d - Use imgaug for data augmentation
- a92193a3 - Simlify PIL/numpy conversion
- eac6c5d9 - Reimplement MaxResize to resize only big images
- 5138e979 - Use ImageNet normalization values
- c9cb5326 - Use tensors in Resize transformations
- 7c320449 - Normalize test images
- b7a4cf74 - Fix prediction tests
- 96666030 - Fix linting
- a1ca620d - Remove imgaug from requirements
mentioned in issue #110 (closed)
mentioned in issue #117 (closed)
mentioned in issue #122 (closed)
mentioned in issue #124 (closed)