# Data augmentation transforms

This page lists the data augmentation transforms used in DAN.

## Individual augmentation transforms

### Elastic Transform

|                              | Elastic Transform |
| ---------------------------- | ----------------- |
| Description                  | This transformation applies local distortions that rotate individual characters. |
| Comments                     | The impact of this transformation is mostly visible on documents, not so much on lines. Results are comparable to the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ElasticTransform) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.44 (3013x128 pixels) / 0.86 (1116x581 pixels) |

### PieceWise Affine

!!! warning

    This transform is temporarily removed from the pipeline until [this issue](https://github.com/albumentations-team/albumentations/issues/1442) is fixed.

|                              | PieceWise Affine |
| ---------------------------- | ---------------- |
| Description                  | This transformation also applies local distortions, but with a larger grid than ElasticTransform. |
| Comments                     | This transformation is very slow. It is a new transform that was not in the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PiecewiseAffine) |
| Examples                     |   |
| CPU time (seconds/10 images) | 2.92 (3013x128 pixels) / 3.76 (1116x581 pixels) |

### Dilation Erosion

|                              | Dilation & Erosion |
| ---------------------------- | ------------------ |
| Description                  | This transformation makes the pen stroke thicker or thinner. |
| Comments                     | The `RandomDilationErosion` class randomly selects a kernel size and applies a dilation or an erosion to the image. It relies on `opencv` and is similar to the original DAN implementation. |
| Documentation                | See the [`opencv` documentation](https://docs.opencv.org/3.4/db/df6/tutorial_erosion_dilatation.html) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.03 (1116x581 pixels) |

### Sharpen

|                              | Sharpen |
| ---------------------------- | ------- |
| Description                  | This transformation makes the image sharper. |
| Comments                     | Similar to the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Sharpen) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.04 (1116x581 pixels) |

### Color Jittering

|                              | Color Jittering |
| ---------------------------- | --------------- |
| Description                  | This transformation alters the colors of the image. |
| Comments                     | Similar to the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ColorJitter) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.03 (3013x128 pixels) / 0.04 (1116x581 pixels) |

### Gaussian Noise

|                              | Gaussian Noise |
| ---------------------------- | -------------- |
| Description                  | This transformation adds Gaussian noise to the image. |
| Comments                     | The noise from the original DAN implementation is more uniform. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussianNoise) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.29 (3013x128 pixels) / 0.53 (1116x581 pixels) |

### Gaussian Blur

|                              | Gaussian Blur |
| ---------------------------- | ------------- |
| Description                  | This transformation blurs the image. |
| Comments                     | Similar to the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussianBlur) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.01 (3013x128 pixels) / 0.02 (1116x581 pixels) |

### Random Perspective

|                              | Random Perspective |
| ---------------------------- | ------------------ |
| Description                  | This transformation changes the perspective from which the photo is taken. |
| Comments                     | Similar to the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Perspective) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.05 (3013x128 pixels) / 0.05 (1116x581 pixels) |

### Shearing (x-axis)

|                              | Shearing (x-axis) |
| ---------------------------- | ----------------- |
| Description                  | This transformation changes the slant of the text on the image. |
| Comments                     | It is a new transform that was not in the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Affine) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.05 (3013x128 pixels) / 0.04 (1116x581 pixels) |

### Coarse Dropout

|                              | Coarse Dropout |
| ---------------------------- | -------------- |
| Description                  | This transformation adds dropout to the image, turning small patches into black pixels. |
| Comments                     | It is a new transform that was not in the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/dropout/coarse_dropout/#coarsedropout-augmentation-augmentationsdropoutcoarse_dropout) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.02 (1116x581 pixels) |

### Random Scale

|               | RandomScale |
| ------------- | ----------- |
| Description   | This transformation downscales the image by a random factor. |
| Comments      | The original DAN implementation reimplemented it as [DPIAdjusting](https://github.com/FactoDeepLearning/DAN/blob/da3046a1cc83e9be3e54dd31a5e74d6134d1ebdc/basic/transforms.py#L62). |
| Documentation | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.RandomScale) |
| Examples      |   |

### To Gray

|                              | ToGray |
| ---------------------------- | ------ |
| Description                  | This transformation converts an RGB image to grayscale. |
| Comments                     | It is a new transform that was not in the original DAN implementation. |
| Documentation                | See the [`albumentations` documentation](https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToGray) |
| Examples                     |   |
| CPU time (seconds/10 images) | 0.02 (3013x128 pixels) / 0.02 (1116x581 pixels) |

## Full augmentation pipeline

- Data augmentation is applied with a probability of 0.9.
- When augmentation is applied, two transformations are randomly selected and applied.
- Results are reproducible when `random.seed` and `np.random.seed` are set (already done in `dan/ocr/document/train.py`).
- Examples with new pipeline:
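
The selection logic of the full pipeline can be illustrated with a minimal, standard-library-only sketch. It is not the DAN implementation: the transforms are hypothetical stand-ins that only record their name, the `augment` and `run` helpers are invented for this illustration, and the sketch assumes that the two selected transforms are distinct, which this page does not state.

```python
import random

# Hypothetical stand-ins for the transforms listed on this page: each one
# only appends its name, so the selection logic can be observed without images.
def make_transform(name):
    def transform(applied):
        return applied + [name]
    return transform

TRANSFORMS = [
    make_transform(name)
    for name in (
        "ElasticTransform", "DilationErosion", "Sharpen", "ColorJitter",
        "GaussianNoise", "GaussianBlur", "Perspective", "Shearing",
        "CoarseDropout", "RandomScale", "ToGray",
    )
]

def augment(sample, rng, p=0.9, n=2):
    """With probability p, apply n randomly chosen transforms to the sample."""
    if rng.random() >= p:
        return sample  # augmentation is skipped for this sample
    # Assumption: the n transforms are drawn without replacement.
    for transform in rng.sample(TRANSFORMS, n):
        sample = transform(sample)
    return sample

def run(seed, count=1000):
    """Augment `count` empty samples with a generator seeded by `seed`."""
    rng = random.Random(seed)
    return [augment([], rng) for _ in range(count)]

if __name__ == "__main__":
    for applied in run(seed=0, count=5):
        print(applied or "no augmentation")
```

Calling `run` twice with the same seed yields the same sequence of choices, which is the effect that seeding has on the real pipeline. The sketch uses a local `random.Random` instance instead of the global `random.seed` and `np.random.seed` calls mentioned above, purely to keep the example self-contained.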