Initiate training as a new script

Create a new script (not a worker) in worker_pylaia/train.py that will create a new model.

Arguments:

--syms-path, pathlib.Path, required, path to the symbols file that maps each string to an integer (built from the training dataset)
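A minimal sketch of how this argument could be declared with argparse. The helper name build_parser and the help texts are assumptions made for illustration; only the flag name, its type, and the fact that it is required come from this issue.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser for the training script (hypothetical helper name)."""
    parser = argparse.ArgumentParser(
        description="Create a new PyLaia model before training."
    )
    parser.add_argument(
        "--syms-path",
        type=Path,  # argparse converts the raw string to a pathlib.Path
        required=True,
        help="Path to the symbols file mapping strings to integers.",
    )
    return parser
```

argparse stores the value under args.syms_path, which matches the <args.syms_path> placeholder used in the config below.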

Define the following functions:

get_model_creation_config(): this will
- create a tmpdir where the model will be saved, suffixed by -train (just like U-FCN)
- return a JSON version of the following YAML file (inspired by the original creation YAML config file)

common:
  train_path: <tmpdir_created>
adaptive_pooling: avgpool-16
crnn:
  cnn_activation:
  - LeakyReLU
  - LeakyReLU
  - LeakyReLU
  - LeakyReLU
  cnn_batchnorm:
  - true
  - true
  - true
  - true
  cnn_dilation:
  - 1
  - 1
  - 1
  - 1
  cnn_kernel_size:
  - 3
  - 3
  - 3
  - 3
  cnn_num_features:
  - 12
  - 24
  - 48
  - 48
  cnn_poolsize:
  - 2
  - 2
  - 0
  - 2
  lin_dropout: 0.5
  rnn_dropout: 0.5
  rnn_layers: 3
  rnn_type: LSTM
  rnn_units: 256
fixed_input_height: 128
save_model: true
syms: <args.syms_path>
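One possible shape for get_model_creation_config, given as a sketch rather than the expected implementation. It assumes that "JSON version" means a plain dict that json.dumps can serialize, and that the syms path is passed in as a parameter (the issue does not specify the signature). The values are copied from the YAML above.

```python
import tempfile
from pathlib import Path


def get_model_creation_config(syms_path: Path) -> dict:
    """Return the model creation config as a JSON-serializable dict.

    Assumption: syms_path is given as a parameter; the issue leaves this open.
    """
    # Temporary directory where the model will be saved, suffixed by -train
    train_dir = tempfile.mkdtemp(suffix="-train")
    return {
        "common": {"train_path": train_dir},
        "adaptive_pooling": "avgpool-16",
        "crnn": {
            "cnn_activation": ["LeakyReLU"] * 4,
            "cnn_batchnorm": [True] * 4,
            "cnn_dilation": [1] * 4,
            "cnn_kernel_size": [3] * 4,
            "cnn_num_features": [12, 24, 48, 48],
            "cnn_poolsize": [2, 2, 0, 2],
            "lin_dropout": 0.5,
            "rnn_dropout": 0.5,
            "rnn_layers": 3,
            "rnn_type": "LSTM",
            "rnn_units": 256,
        },
        "fixed_input_height": 128,
        "save_model": True,
        # Stored as a string so the dict stays JSON-serializable
        "syms": str(syms_path),
    }
```

tempfile.mkdtemp does not remove the directory afterwards, which fits here because the created model has to outlive the function call.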

create_model(): this will call laia.scripts.htr.create_model.run (imported as create_model) with the following arguments:

run(
    syms=config["syms"],
    fixed_input_height=config["fixed_input_height"],
    adaptive_pooling=config["adaptive_pooling"],
    common=CommonArgs(**config["common"]),
    crnn=CreateCRNNArgs(**config["crnn"]),
    save_model=config["save_model"],
)

To test your script, you can use this syms file and check that the architecture of the created model is the same as this one's.

Running the script will first call get_model_creation_config, then create_model. It will log the path to the created model and print the model's details.
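The flow described above could be wired roughly as follows. This is only a sketch: run_training_setup is a hypothetical name, and the two functions are passed in as callables so that the snippet runs without PyLaia installed. In the real script they would be the get_model_creation_config and create_model functions defined in train.py.

```python
import logging
from pathlib import Path
from typing import Any, Callable

logger = logging.getLogger(__name__)


def run_training_setup(
    syms_path: Path,
    get_config: Callable[[Path], dict],
    create: Callable[[dict], Any],
) -> Any:
    """Build the config, create the model, then report on it (hypothetical helper)."""
    # Step 1: build the creation config (this also creates the -train tmpdir)
    config = get_config(syms_path)
    # Step 2: create the model from that config
    model = create(config)
    # Step 3: log where the model was saved and print its details
    logger.info("Model created in %s", config["common"]["train_path"])
    print(model)
    return model
```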

Edited Sep 06, 2022 by Yoann Schneider
