


Data augmentation transforms
This page lists the data augmentation transforms used in DAN.

Individual augmentation transforms

Elastic Transform




Description
This transformation applies local distortions that slightly rotate individual characters.


Comments
The effect of this transformation is mostly visible on document images and much less on line images. Results are comparable to those of the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.44 (3013x128 pixels) / 0.86 (1116x581 pixels)
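
The CPU times on this page are reported in seconds per 10 images, for two image sizes. A minimal sketch of how such a measurement could be taken is shown here, assuming NumPy images, sizes given as width x height, and any callable transform. The helper `time_transform` and the stand-in `invert` transform are hypothetical and are not part of DAN.

```python
import time

import numpy as np


def time_transform(transform, width, height, n_images=10):
    """Return the seconds needed to apply `transform` to `n_images` images."""
    rng = np.random.default_rng(0)
    # Random RGB images of shape (height, width, 3).
    images = [
        rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
        for _ in range(n_images)
    ]
    start = time.perf_counter()
    for image in images:
        transform(image)
    return time.perf_counter() - start


def invert(image):
    """Stand-in transform (hypothetical): invert pixel intensities."""
    return 255 - image


if __name__ == "__main__":
    # The two sizes used on this page, assumed to be width x height.
    for width, height in [(3013, 128), (1116, 581)]:
        seconds = time_transform(invert, width, height)
        print(f"{width}x{height} pixels: {seconds:.2f} s / 10 images")
```

Image generation happens before the timer starts, so only the transform itself is measured.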


PieceWise Affine
!!! warning
    This transform is temporarily removed from the pipeline until this issue is fixed.




Description
This transformation also applies local distortions but with a larger grid than ElasticTransform.


Comments
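This transformation is very slow: at about 3 to 4 seconds per 10 images on CPU, it is the slowest of the transforms timed on this page. It is a new transform that was not in the original DAN implementation.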


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
2.92 (3013x128 pixels) / 3.76 (1116x581 pixels)


Dilation Erosion


Dilation & Erosion


Description
This transformation makes the pen stroke thicker or thinner.


Comments
The RandomDilationErosion class randomly selects a kernel size and applies either a dilation or an erosion to the image. It relies on OpenCV and is similar to the original DAN implementation.


Documentation
See the OpenCV documentation


Examples

 
CPU time (seconds/10 images)
0.02 (3013x128 pixels) / 0.03 (1116x581 pixels)
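
The behaviour described above (pick a random kernel size, then apply either a dilation or an erosion) can be sketched as follows. The actual RandomDilationErosion class relies on OpenCV. This sketch reproduces the same idea with NumPy only, on a 2D grayscale image with a square kernel. The function names and the `min_kernel` / `max_kernel` parameters are assumptions made for illustration, not the DAN API.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate_or_erode(image, kernel_size, dilate):
    """Apply a square-kernel dilation (local max) or erosion (local min).

    `image` is a 2D grayscale array; the output has the same shape.
    """
    pad_before = kernel_size // 2
    pad_after = kernel_size - 1 - pad_before
    padded = np.pad(
        image, ((pad_before, pad_after), (pad_before, pad_after)), mode="edge"
    )
    # One kernel_size x kernel_size window per output pixel.
    windows = sliding_window_view(padded, (kernel_size, kernel_size))
    if dilate:
        return windows.max(axis=(-2, -1))
    return windows.min(axis=(-2, -1))


def random_dilation_erosion(image, rng, min_kernel=1, max_kernel=3):
    """Pick a random kernel size and a random operation, then apply it."""
    kernel_size = int(rng.integers(min_kernel, max_kernel + 1))
    dilate = bool(rng.integers(0, 2))
    return dilate_or_erode(image, kernel_size, dilate)


if __name__ == "__main__":
    # Dark "ink" pixel on a white background.
    page = np.full((5, 5), 255, dtype=np.uint8)
    page[2, 2] = 0
    print(random_dilation_erosion(page, np.random.default_rng(0)))
```

With dark ink on a light background, erosion (local minimum) thickens the stroke and dilation (local maximum) thins it, because both operations are defined with respect to bright pixels.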


Sharpen




Description
This transformation makes the image sharper.


Comments
Similar to the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.02 (3013x128 pixels) / 0.04 (1116x581 pixels)


Color Jittering




Description
This transformation alters the colors of the image.


Comments
Similar to the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.03 (3013x128 pixels) / 0.04 (1116x581 pixels)


Gaussian Noise




Description
This transformation adds Gaussian noise to the image.


Comments
The noise from the original DAN implementation is more uniform.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.29 (3013x128 pixels) / 0.53 (1116x581 pixels)
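
To illustrate what adding Gaussian noise means at the pixel level, here is a minimal NumPy sketch. It is not the albumentations implementation. The function name and the `std` parameter are assumptions chosen for the example.

```python
import numpy as np


def add_gaussian_noise(image, rng, std=10.0):
    """Add zero-mean Gaussian noise to a uint8 image and clip to [0, 255]."""
    noise = rng.normal(loc=0.0, scale=std, size=image.shape)
    noisy = image.astype(np.float32) + noise
    # Round, then clip so that the result is still a valid uint8 image.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gray = np.full((128, 256, 3), 128, dtype=np.uint8)
    noisy = add_gaussian_noise(gray, rng)
    print(noisy.shape, noisy.dtype)
```

Each pixel receives an independent noise sample, so a larger `std` produces a grainier image.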


Gaussian Blur




Description
This transformation blurs the image.


Comments
Similar to the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.01 (3013x128 pixels) / 0.02 (1116x581 pixels)


Random Perspective




Description
This transformation changes the perspective from which the photo is taken.


Comments
Similar to the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.05 (3013x128 pixels) / 0.05 (1116x581 pixels)


Shearing (x-axis)




Description
This transformation changes the slant of the text on the image.


Comments
It is a new transform that was not in the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.05 (3013x128 pixels) / 0.04 (1116x581 pixels)


Coarse Dropout




Description
This transformation applies dropout to the image, turning small patches black.


Comments
It is a new transform that was not in the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.02 (3013x128 pixels) / 0.02 (1116x581 pixels)
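
The effect of coarse dropout can be sketched in a few lines of NumPy. This is an illustration of the idea only, not the albumentations implementation. The function name and the `n_holes`, `hole_height` and `hole_width` parameters are assumptions.

```python
import numpy as np


def coarse_dropout(image, rng, n_holes=4, hole_height=8, hole_width=8):
    """Return a copy of `image` with `n_holes` random patches set to black."""
    out = image.copy()
    height, width = image.shape[:2]
    for _ in range(n_holes):
        # Pick the top-left corner so that the patch stays inside the image.
        top = int(rng.integers(0, max(1, height - hole_height + 1)))
        left = int(rng.integers(0, max(1, width - hole_width + 1)))
        out[top : top + hole_height, left : left + hole_width] = 0
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = np.full((128, 512, 3), 255, dtype=np.uint8)
    dropped = coarse_dropout(white, rng)
    print(int((dropped[..., 0] == 0).sum()), "black pixels")
```

Patches may overlap, so the number of black pixels is at most `n_holes * hole_height * hole_width`.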


Random Scale


RandomScale


Description
This transformation downscales the image by a random factor.


Comments
The original DAN implementation reimplemented it as DPIAdjusting.


Documentation
See the albumentations documentation


Examples

 
To Gray


ToGray


Description
This transformation converts an RGB image to grayscale.


Comments
It is a new transform that was not in the original DAN implementation.


Documentation
See the albumentations documentation


Examples

 
CPU time (seconds/10 images)
0.02 (3013x128 pixels) / 0.02 (1116x581 pixels)


Full augmentation pipeline

Data augmentation is applied with a probability of 0.9.
When it is applied, two transformations are randomly selected and applied to the image.
Results can be reproduced by setting random.seed and np.random.seed (this is already done in dan/ocr/document/train.py).
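
The selection logic described above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used in DAN: the transforms are identity placeholders, their labels are hypothetical, and the two transforms are assumed to be drawn without replacement.

```python
import random


def identity(image):
    """Placeholder transform (hypothetical): returns the image unchanged."""
    return image


# Hypothetical labels standing in for the transforms listed on this page.
TRANSFORMS = [
    ("elastic", identity),
    ("dilation_erosion", identity),
    ("sharpen", identity),
    ("color_jittering", identity),
    ("gaussian_noise", identity),
    ("gaussian_blur", identity),
]


def augment(image, transforms=TRANSFORMS, probability=0.9, n_transforms=2):
    """Apply `n_transforms` randomly chosen transforms with the given probability."""
    if random.random() >= probability:
        return image, []  # this sample is left untouched
    # Assumption: the transforms are drawn without replacement.
    chosen = random.sample(transforms, n_transforms)
    applied = []
    for name, transform in chosen:
        image = transform(image)
        applied.append(name)
    return image, applied


def run(seed, n_samples=5):
    """Seed the generator, then list the transforms picked for each sample."""
    random.seed(seed)
    return [augment("image")[1] for _ in range(n_samples)]


if __name__ == "__main__":
    # The same seed yields the same sequence of selections.
    print(run(0) == run(0))
    print(run(0))
```

Only the standard `random` module is seeded here because the sketch does not use NumPy. A pipeline whose transforms draw from NumPy's global generator would also need np.random.seed, as noted above.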
Examples with new pipeline: