# DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition

## Documentation

For more details about this package, make sure to see the documentation available at https://teklia.gitlab.io/atr/dan/.

## Installation

To use DAN in your own scripts, install it using pip:

```console
pip install -e .
```

## Inference

To apply DAN to an image, one first needs to add a few imports and load an image. Note that the image must be in RGB.

```python
import cv2
from dan.predict import DAN

image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)
```

Then one can initialize and load the trained model with the parameters used during training.

```python
model_path = 'model.pt'
params_path = 'parameters.yml'
charset_path = 'charset.pkl'

model = DAN('cpu')
model.load(model_path, params_path, charset_path, mode="eval")
```

To run the inference on a GPU, replace `cpu` with the name of the GPU device.

Finally, one can run the prediction:

```python
text, confidence_scores = model.predict(image, confidences=True)
```

## Training

This package provides four subcommands. To get more information about any subcommand, use the `--help` option.

### Data extraction from Arkindex

See the [dedicated section](https://teklia.gitlab.io/atr/dan/usage/datasets/extract/) of the official DAN documentation.

### Dataset formatting

See the [dedicated section](https://teklia.gitlab.io/atr/dan/usage/datasets/format/) of the official DAN documentation.

### Model training

See the [dedicated section](https://teklia.gitlab.io/atr/dan/usage/train/) of the official DAN documentation.

### Synthetic data generation

See the [dedicated section](https://teklia.gitlab.io/atr/dan/usage/generate/) of the official DAN documentation.
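
## Full inference example

For convenience, the inference steps above can be combined into a single script. This is a minimal sketch that only reuses the calls shown in the Inference section; `IMAGE_PATH` and the model, parameter, and charset file paths are placeholders to adapt to your own image and trained model.

```python
import cv2

from dan.predict import DAN

# Placeholder paths: adapt these to your own image and trained model files.
IMAGE_PATH = "page.jpg"
model_path = "model.pt"
params_path = "parameters.yml"
charset_path = "charset.pkl"

# Load the image and convert it from OpenCV's BGR layout to RGB.
image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)

# Initialize DAN on CPU (replace "cpu" with a GPU device name to run on GPU)
# and load the trained weights, parameters and charset in evaluation mode.
model = DAN("cpu")
model.load(model_path, params_path, charset_path, mode="eval")

# Run the prediction and print the recognized text with its confidence scores.
text, confidence_scores = model.predict(image, confidences=True)
print(text)
print(confidence_scores)
```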