Use ImageNet mean and std values (!183) · Merge requests · Automatic Text Recognition / DAN · GitLab


Merged Mélodie Boillet requested to merge use-default-mean-std into main 1 year ago

Closes #111 (closed) #114 (closed) #102 (closed)

Needs !171 (merged)

In this MR:

We use ImageNet's normalization parameters (which give the same results as those calculated on the training set). This saves time before training, as we no longer iterate over all the training data to calculate these parameters, while maintaining similar performance (#111 (closed))
Thanks to this modification, we can use torchvision's normalization function. It takes a tensor of size CxHxW as input, so we now load the data directly with torchvision's read_image function (#102 (closed))
The pre-processing transformations are now applied to tensors (and not to PIL images), so the code has been updated accordingly
The data pre-processing is done directly after loading the images, to reduce memory requirements when the load_in_memory parameter is used
Update of the documentation and tests
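
As a rough illustration of the normalization described above, the sketch below spells out the per-channel arithmetic with the standard ImageNet statistics. It uses plain nested lists instead of tensors so that it runs without dependencies; in this MR the equivalent step goes through torchvision's normalization function on CxHxW tensors. The function name `normalize_chw` and the toy image are assumptions made for the example, not code from DAN.

```python
# ImageNet per-channel statistics (RGB order), for pixel values scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_chw(image, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Normalize a CxHxW image given as nested lists of 8-bit values (0-255).

    Each channel c is mapped to (pixel / 255 - mean[c]) / std[c], the same
    formula that torchvision.transforms.Normalize applies to float tensors.
    Hypothetical helper, written for illustration only.
    """
    if len(image) != len(mean) or len(image) != len(std):
        raise ValueError("expected one mean/std value per channel")
    return [
        [[(pixel / 255.0 - m) / s for pixel in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


# Toy 3x1x2 image: three channels, one row, two pixels per row.
example = [[[0, 255]], [[128, 128]], [[255, 0]]]
normalized = normalize_chw(example)
```

Because the statistics are constants, no pass over the training set is needed before training starts, which is where the time saving mentioned above comes from.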

Edited 1 year ago by Mélodie Boillet

Activity

Mélodie Boillet changed milestone to %DAN-P4: Improve data loading and preprocessing 1 year ago

Mélodie Boillet added P2 label 1 year ago

Mélodie Boillet assigned to @melodie.boillet 1 year ago

Mélodie Boillet added 10 commits 1 year ago

9bbdba9e - Remove grayscale from preprocessing and add it to data augmentation pipeline

0f3e6a87 - Reimplement MaxResize to resize only big images

56d97810 - Remove deepcopy when I can

3ac2f11b - Download images in subresolution using IIIF url

d34ca68e - add space instead of \n in entities

941ae466 - Input prediction folder

43cca91a - Use a random transcription of an element when more than one found

330b435c - Fix entity splitting

daf7c86f - Use ImageNet normalization values

2ddb4e8b - Use tensors in Resize transformations

Mélodie Boillet added 1 commit 1 year ago

3d6701b8 - Normalize test images

Mélodie Boillet added 11 commits 1 year ago

ca471d3c - 1 earlier commit

ee9fcfdc - Remove deepcopy when I can

97231ac8 - Download images in subresolution using IIIF url

2e06f94b - add space instead of \n in entities

8bb1cd5f - Input prediction folder

617c4069 - Use a random transcription of an element when more than one found

67fc0bd5 - Fix entity splitting

be088836 - Use ImageNet normalization values

84469a78 - Use tensors in Resize transformations

f9fdc495 - Normalize test images

6a729093 - Fix prediction tests


Mélodie Boillet @mboillet · 1 year ago

Author Maintainer

Here are the comparison results. On xenarque:

GPU 0: NVIDIA GeForce RTX 3080 Ti, major=8, minor=6, total_memory=12045MB, multi_processor_count=80
POPP single page (128 / 16 / 16)
50 epochs
batch = 1 / valid batch = 2
max char = 1000
DAN POPP Line fine-tuning on single pages

| Param | Original | Albumentations | Albumentations v2 + default mean/std |
| --- | --- | --- | --- |
| items/s - train | ~ 7 | ~ 6.5 | ~ 6.9 |
| s/items - valid | ~ 4.7 | ~ 4.6 | ~ 4.4 |
| best epoch | 40 | 35 | 45 |
| best valid cer | 0.2973 | 0.2876 | 0.2812 |
| test cer | 0.3052 | 0.3162 | 0.3131 |

Results are similar on the validation set, so we can use these ImageNet normalization parameters.
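
For readers unfamiliar with the CER figures reported here, the sketch below shows one common way to compute a character error rate: the total Levenshtein edit distance divided by the total number of reference characters. This is an assumption about the metric's definition for illustration purposes; the function names are hypothetical and this is not necessarily how DAN computes it.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def cer(references, hypotheses):
    """Character error rate over paired reference and predicted transcriptions."""
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(
        edit_distance(ref, hyp) for ref, hyp in zip(references, hypotheses)
    )
    return total_edits / total_chars
```

Under this definition, a CER of about 0.28 means roughly 28 character edits are needed per 100 reference characters.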

Mélodie Boillet @mboillet · 1 year ago

Author Maintainer

I loaded an old model and ran the evaluation: the results with the computed mean and std and with the default mean and std are the same.

I also updated the prediction tests, results are equal.
Mélodie Boillet @mboillet · 1 year ago

Author Maintainer
Prediction tests are now fixed. I'm waiting for !171 (merged) to be merged before updating:

training tests

documentation

DAN worker (Remove this line https://gitlab.com/teklia/workers/dan/-/blob/main/worker_dan/worker.py#L411)
Mélodie Boillet added 32 commits 1 year ago

6a729093...829ad8f7 - 6 commits from branch main

829ad8f7...16043a04 - 16 earlier commits

0df80aa4 - Fix typos and remove useless code

fff02535 - Update tests with new transforms/resize

4e40889d - Preprocess with torchvision transforms

a1e25549 - Use imgaug for data augmentation

69a7ebb9 - Simlify PIL/numpy conversion

6080e98e - Reimplement MaxResize to resize only big images

8adfe041 - Use ImageNet normalization values

6be2654c - Use tensors in Resize transformations

ae1673af - Normalize test images

fe9a0876 - Fix prediction tests

Mélodie Boillet added 1 commit 1 year ago

85579a21 - Fix linting

Mélodie Boillet added 1 commit 1 year ago

4a55ead3 - Remove imgaug from requirements

Mélodie Boillet added 11 commits 1 year ago

78e6f601 - 1 commit from branch main

3a252197 - Preprocess with torchvision transforms

b685b84d - Use imgaug for data augmentation

a92193a3 - Simlify PIL/numpy conversion

eac6c5d9 - Reimplement MaxResize to resize only big images

5138e979 - Use ImageNet normalization values

c9cb5326 - Use tensors in Resize transformations

7c320449 - Normalize test images

b7a4cf74 - Fix prediction tests

96666030 - Fix linting

a1ca620d - Remove imgaug from requirements

Mélodie Boillet added 1 commit 1 year ago

cb6e700f - Preprocess image during image loading

Mélodie Boillet added 1 commit 1 year ago

7baf9078 - Normalize test images

Mélodie Boillet added 1 commit 1 year ago

4af11130 - Preprocess image during image loading

Mélodie Boillet added 1 commit 1 year ago

2b1e4ad3 - Fix training tests

Mélodie Boillet changed the description 1 year ago

Mélodie Boillet added 1 commit 1 year ago

43982202 - Update documentation and fix some typos

Mélodie Boillet changed the description 1 year ago

Mélodie Boillet marked this merge request as ready 1 year ago

Mélodie Boillet changed title from Use default mean std to Use ImageNet mean and std values 1 year ago

Mélodie Boillet mentioned in issue #110 (closed) 1 year ago

Mélodie Boillet requested review from @schneider-y 1 year ago

Yoann Schneider approved this merge request 1 year ago

Yoann Schneider @yschneider · 1 year ago

Maintainer

Great work!
Yoann Schneider merged 1 year ago

Mélodie Boillet mentioned in issue #117 (closed) 1 year ago

Mélodie Boillet mentioned in issue #122 (closed) 1 year ago

Mélodie Boillet mentioned in issue #124 (closed) 1 year ago

