Data augmentation transforms
This page lists data augmentation transforms used in DAN.
Individual augmentation transforms
Elastic Transform
Elastic Transform | |
---|---|
Description | This transformation applies local distortions that slightly rotate individual characters. |
Comments | The impact of this transformation is mostly visible on documents, not so much on lines. Results are comparable to the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.44 (3013x128 pixels) / 0.86 (1116x581 pixels) |
PieceWise Affine
!!! warning
    This transform is temporarily removed from the pipeline until this issue is fixed.
PieceWise Affine | |
---|---|
Description | This transformation also applies local distortions but with a larger grid than ElasticTransform. |
Comments | This transformation is very slow. It is a new transform that was not in the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 2.92 (3013x128 pixels) / 3.76 (1116x581 pixels) |
Dilation Erosion
Dilation & Erosion | |
---|---|
Description | This transformation makes the pen stroke thicker or thinner. |
Comments | The RandomDilationErosion class randomly selects a kernel size and applies a dilation or an erosion to the image. It relies on opencv and is similar to the original DAN implementation. |
Documentation | See the opencv documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.03 (1116x581 pixels) |
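
The random kernel selection and the dilation/erosion choice described above can be sketched as follows. This is a minimal illustration, not the `RandomDilationErosion` class itself: it uses NumPy only (the actual class relies on opencv), and the function name `random_dilation_erosion`, the kernel range and the 50/50 choice between the two operations are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def random_dilation_erosion(image, min_kernel=1, max_kernel=3, rng=None):
    """Randomly dilate or erode a 2D grayscale image with a square kernel.

    Hypothetical sketch: the name, defaults and 50/50 choice are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw a kernel size in [min_kernel, max_kernel].
    k = int(rng.integers(min_kernel, max_kernel + 1))
    # Dilation takes the local maximum, erosion the local minimum.
    # For dark ink on a light background, erosion thickens the stroke
    # and dilation thins it.
    reduce_fn = np.max if rng.random() < 0.5 else np.min
    # Pad with edge values so the output keeps the input shape.
    before, after = (k - 1) // 2, k // 2
    padded = np.pad(image, ((before, after), (before, after)), mode="edge")
    # One (k, k) window per output pixel.
    windows = sliding_window_view(padded, (k, k))
    return reduce_fn(windows, axis=(-2, -1))
```

Passing a seeded `np.random.default_rng(...)` as `rng` makes the sampled kernel size and operation repeatable.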
Sharpen
Sharpen | |
---|---|
Description | This transformation makes the image sharper. |
Comments | Similar to the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.04 (1116x581 pixels) |
Color Jittering
Color Jittering | |
---|---|
Description | This transformation alters the colors of the image. |
Comments | Similar to the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.03 (3013x128 pixels) / 0.04 (1116x581 pixels) |
Gaussian Noise
Gaussian Noise | |
---|---|
Description | This transformation adds Gaussian noise to the image. |
Comments | The noise from the original DAN implementation is more uniform. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.29 (3013x128 pixels) / 0.53 (1116x581 pixels) |
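
For readers unfamiliar with the operation, additive Gaussian noise can be sketched in a few lines of NumPy. This is an illustrative assumption, not the albumentations transform used by DAN: the function name `add_gaussian_noise` and the default standard deviation are hypothetical.

```python
import numpy as np


def add_gaussian_noise(image, std=10.0, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image (hypothetical sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample one independent noise value per pixel (and per channel).
    noise = rng.normal(0.0, std, size=image.shape)
    noisy = image.astype(np.float32) + noise
    # Round and clip back to the valid 8-bit range.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)
```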
Gaussian Blur
Gaussian Blur | |
---|---|
Description | This transformation blurs the image. |
Comments | Similar to the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.01 (3013x128 pixels) / 0.02 (1116x581 pixels) |
Random Perspective
Random Perspective | |
---|---|
Description | This transformation changes the perspective from which the photo is taken. |
Comments | Similar to the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.05 (3013x128 pixels) / 0.05 (1116x581 pixels) |
Shearing (x-axis)
Shearing (x-axis) | |
---|---|
Description | This transformation changes the slant of the text on the image. |
Comments | New transform that was not in the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.05 (3013x128 pixels) / 0.04 (1116x581 pixels) |
Coarse Dropout
Coarse Dropout | |
---|---|
Description | This transformation applies dropout to the image, turning small patches into black pixels. |
Comments | It is a new transform that was not in the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.02 (1116x581 pixels) |
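
The idea of blacking out small patches can be sketched with NumPy alone. This is a simplified illustration, not the albumentations implementation: the function name `coarse_dropout`, the number of holes and the hole size are hypothetical values chosen for the example.

```python
import numpy as np


def coarse_dropout(image, n_holes=4, hole_size=(8, 8), rng=None):
    """Set a few random rectangular patches to black (hypothetical sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()  # leave the input untouched
    height, width = image.shape[:2]
    hole_h, hole_w = hole_size
    for _ in range(n_holes):
        # Pick a top-left corner so the patch fits inside the image.
        y = int(rng.integers(0, max(1, height - hole_h + 1)))
        x = int(rng.integers(0, max(1, width - hole_w + 1)))
        out[y:y + hole_h, x:x + hole_w] = 0
    return out
```

Patches may overlap, so the number of black pixels is at most `n_holes * hole_h * hole_w`.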
Random Scale
RandomScale | |
---|---|
Description | This transformation downscales the image by a random factor. |
Comments | The original DAN implementation reimplemented it as DPIAdjusting. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
To Gray
ToGray | |
---|---|
Description | This transformation converts an RGB image to grayscale. |
Comments | It is a new transform that was not in the original DAN implementation. |
Documentation | See the albumentations documentation |
Examples | ![]() ![]() |
CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.02 (1116x581 pixels) |
Full augmentation pipeline
- Data augmentation is applied with a probability of 0.9.
- When augmentation is applied, two transformations are randomly selected.
- Reproducibility is possible by setting `random.seed` and `np.random.seed` (already done in `dan/ocr/document/train.py`).
- Examples with new pipeline:
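
Separately from the rendered examples, the selection rules listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in `dan/ocr/document/train.py`: the `augment` and `make_transform` helpers are hypothetical, the stand-in transforms only record their names, the two selected transforms are assumed to be distinct, and only the standard-library `random` module is seeded here.

```python
import random


def augment(image, transforms, p=0.9, n_selected=2, rng=random):
    """Apply `n_selected` randomly chosen transforms with probability `p`."""
    # With probability 1 - p, the image is returned untouched.
    if rng.random() >= p:
        return image
    # Otherwise pick distinct transforms (an assumption) and apply them
    # in the sampled order.
    for transform in rng.sample(transforms, n_selected):
        image = transform(image)
    return image


def make_transform(name):
    # Hypothetical stand-in: records its name instead of editing pixels.
    return lambda applied: applied + [name]


transforms = [make_transform(n) for n in ("sharpen", "blur", "shear", "dropout")]

# Seeding before each run yields the same sequence of selections.
random.seed(0)
first_run = [augment([], transforms) for _ in range(5)]
random.seed(0)
second_run = [augment([], transforms) for _ in range(5)]
```

Each entry of `first_run` is either empty (no augmentation) or lists the two transforms that were applied, and `second_run` repeats the same selections because the seed is reset.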