# Training workflow

There are several steps to follow when training a DAN model.

## 1. Extract data

The data must be extracted and formatted for training. To extract the data, DAN uses an Arkindex export database in SQLite format. You will need to:

1. Structure the data into folders (`train` / `val` / `test`) in [Arkindex](https://arkindex.teklia.com/).
2. [Export the project](https://doc.arkindex.org/howto/export/) in SQLite format.
3. Extract the data with the [extract command](../usage/datasets/extract.md).
4. Format the data with the [format command](../usage/datasets/format.md).

At the end, you should have a tree structure like this:
```
output/
├── charset.pkl
├── labels.json
├── split.json
├── images
│   ├── train
│   ├── val
│   └── test
└── labels
    ├── train
    ├── val
    └── test
```
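
Before moving on to training, it can be useful to confirm that the extraction and formatting steps produced every expected entry. The following is a minimal sketch, not part of DAN itself: the helper name `missing_entries` is hypothetical, and the expected names are simply copied from the tree above.

```python
from pathlib import Path

# Expected entries, copied from the tree structure shown above.
EXPECTED_FILES = ["charset.pkl", "labels.json", "split.json"]
EXPECTED_DIRS = [
    f"{parent}/{split}"
    for parent in ("images", "labels")
    for split in ("train", "val", "test")
]


def missing_entries(output_dir):
    """Return the expected files and folders that are absent from output_dir."""
    root = Path(output_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

Calling `missing_entries("output")` returns an empty list when the folder matches the tree above, and otherwise lists the relative paths that still need to be created.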

## 2. Train

The training command does not take any input parameters for now. To train a DAN model, you will therefore need to:

1. Update the parameters listed on the [dedicated page](../usage/train/parameters.md). You will always need to update at least these variables:

    - `dataset_name`, `dataset_level`, `dataset_variant` and `dataset_path`,
    - `model_params.transfer_learning.*[checkpoint_path]` to fine-tune an existing model,
    - `training_params.output_folder`.

2. Train a DAN model with the [train command](../usage/train/index.md).
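
The dotted names used above, such as `training_params.output_folder`, read as paths into a nested configuration. As an illustration only, and assuming the configuration is available as a nested Python dictionary (the actual layout is the one described on the parameters page), a small hypothetical helper could set a value from its dotted path:

```python
def set_by_path(config, dotted_path, value):
    """Set config[a][b][c] = value for the dotted path "a.b.c".

    Intermediate dictionaries are created when they do not exist yet.
    """
    keys = dotted_path.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


# Hypothetical, partial configuration; the folder path is a placeholder.
config = {"training_params": {}}
set_by_path(config, "training_params.output_folder", "outputs/my_experiment")
```

The helper and the example values are assumptions made for this sketch; they are not part of the DAN code base.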

## 3. Predict

Once training is complete, you can apply the trained DAN model to an image.

To do this, you will need to:

1. Create a `parameters.yml` file using the parameters saved during training in the `params` file, located in `{training_params.output_folder}/results`. This file should have the following format:
```yml
version: 0.0.1
parameters:
  max_char_prediction: int
  encoder:
    input_channels: int
    dropout: float
  decoder:
    enc_dim: int
    l_max: int
    dec_pred_dropout: float
    attention_win: int
    vocab_size: int
    h_max: int
    w_max: int
    dec_num_layers: int
    dec_dim_feedforward: int
    dec_num_heads: int
    dec_att_dropout: float
    dec_res_dropout: float
  preprocessings:
    - type: str
      max_height: int
      max_width: int
      fixed_height: int
      fixed_width: int
```
2. Apply a trained DAN model to an image using the [predict command](../usage/predict.md).
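
When copying values from the `params` file into `parameters.yml`, a missing key or a wrongly typed value is easy to overlook. The sketch below is one possible way to check the `parameters` section, assuming it has already been loaded into a Python dictionary (for example with a YAML parser). The names `SCHEMA` and `find_problems` are hypothetical, the expected keys and types are copied from the format described in step 1, and the `preprocessings` list is deliberately left unchecked.

```python
# Expected keys and types for the "parameters" section (step 1 above).
SCHEMA = {
    "max_char_prediction": int,
    "encoder": {"input_channels": int, "dropout": float},
    "decoder": {
        "enc_dim": int,
        "l_max": int,
        "dec_pred_dropout": float,
        "attention_win": int,
        "vocab_size": int,
        "h_max": int,
        "w_max": int,
        "dec_num_layers": int,
        "dec_dim_feedforward": int,
        "dec_num_heads": int,
        "dec_att_dropout": float,
        "dec_res_dropout": float,
    },
}


def find_problems(params, schema=SCHEMA, prefix=""):
    """Return messages describing missing keys or values of the wrong type."""
    problems = []
    for key, expected in schema.items():
        path = f"{prefix}{key}"
        if key not in params:
            problems.append(f"{path}: missing")
        elif isinstance(expected, dict):
            if isinstance(params[key], dict):
                problems += find_problems(params[key], expected, f"{path}.")
            else:
                problems.append(f"{path}: expected a mapping")
        else:
            # A float field also accepts an integer such as 0.
            allowed = (int, float) if expected is float else expected
            # bool is a subclass of int, so it is rejected explicitly.
            if isinstance(params[key], bool) or not isinstance(params[key], allowed):
                problems.append(f"{path}: expected {expected.__name__}")
    return problems
```

An empty list means every key checked here is present with a plausible type; any other result names the dotted path of each entry to fix before running the prediction.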