train ctc
Solène Tarride edited this page Feb 15, 2023
To train your own model, you will need to follow three main steps:
- Prepare your dataset
- Initialize your model
- Train your model
- To train PyLaia, you will need line-level images and their corresponding transcriptions.
- All images should be resized to a fixed height. The recommended value is 128 pixels.
- Your images should be divided into three sets: train, validation, and test.
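Resizing to a fixed height while preserving the aspect ratio can be sketched as follows. This is a minimal illustration, not part of PyLaia itself; the helper name is an assumption, and you would apply the computed size with any imaging library.

```python
def target_size(width: int, height: int, fixed_height: int = 128) -> tuple[int, int]:
    """Compute the (width, height) of an image resized to a fixed height
    while preserving the aspect ratio. Width is rounded to the nearest pixel."""
    new_width = max(1, round(width * fixed_height / height))
    return new_width, fixed_height
```

For example, a 600x300 line image would be resized to 256x128 (e.g. with Pillow's `Image.resize`).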
For each split, 3 files are required:
- `train.txt`, mapping image names to transcription tokens. This file is used to train the model.

```
train/im01 f o r <space> d e t <space> t i l f æ l d e <space> d e t <space> s k u l d e <space> l y k k e s <space> D i g
train/im02 a t <space> o p d r i v e <space> d e t <space> o m s k r e v n e <space> e x p l : <space> a f
train/im03 « F r u <space> I n g e r » , <space> a t <space> s e n d e <space> m i g <space> s a m m e
```
- `train_ids.txt`, listing image names. This file is used for prediction.

```
train/im01
train/im02
train/im03
```
- `train_eval.txt`, mapping image names to plain transcriptions. This file is used during evaluation.

```
train/im01 for det tilfælde det skulde lykkes Dig
train/im02 at opdrive det omskrevne expl: af
train/im03 «Fru Inger», at sende mig samme
```
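The tokenized format in `train.txt` can be derived from the plain transcriptions by splitting each line into characters and mapping literal spaces to the `<space>` token. A minimal sketch (the function name is an assumption):

```python
def tokenize(transcription: str) -> str:
    """Convert a plain transcription into PyLaia's token format:
    one token per character, with literal spaces mapped to <space>."""
    return " ".join("<space>" if ch == " " else ch for ch in transcription)


# The first line of train_eval.txt becomes its train.txt counterpart.
line = "train/im01 for det tilfælde det skulde lykkes Dig"
image_id, text = line.split(" ", 1)
print(image_id, tokenize(text))
```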
Finally, you need to generate the alphabet mapping table, beginning with the `<ctc>` token. You can find an example of the `syms.txt` file on HuggingFace.
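Such a table can be built automatically from the token tables. A minimal sketch (the function name is an assumption; the only constraint stated above is that `<ctc>` comes first):

```python
def build_syms(token_tables: list[str]) -> list[str]:
    """Collect every symbol used in the token tables and assign indices,
    with <ctc> first (index 0), followed by the remaining symbols in sorted order."""
    symbols: set[str] = set()
    for table in token_tables:
        for line in table.splitlines():
            _, tokens = line.split(" ", 1)  # drop the image id
            symbols.update(tokens.split(" "))
    ordered = ["<ctc>"] + sorted(symbols)
    return [f"{sym} {idx}" for idx, sym in enumerate(ordered)]
```

Writing the returned lines to `syms.txt` (one symbol and index per line) gives the mapping expected by the commands below.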
Use the following command to initialize the model. Note that you need to provide the `syms.txt` file when building the model: it is used to compute the size of the alphabet, which determines the output dimension of the last linear layer.
The model can be fully configured using a configuration file or command-line arguments.
- Run the following command to get the full list of command-line arguments:

```shell
pylaia-htr-create-model --help
```
- Or use a YAML configuration file, e.g. `config_create_model.yaml`:

```yaml
adaptive_pooling: avgpool-16
crnn:
  cnn_activation:
    - LeakyReLU
    - LeakyReLU
    - LeakyReLU
    - LeakyReLU
  cnn_batchnorm:
    - true
    - true
    - true
    - true
  cnn_dilation:
    - 1
    - 1
    - 1
    - 1
  cnn_kernel_size:
    - 3
    - 3
    - 3
    - 3
  cnn_num_features:
    - 12
    - 24
    - 48
    - 48
  cnn_poolsize:
    - 2
    - 2
    - 0
    - 2
  lin_dropout: 0.5
  rnn_dropout: 0.5
  rnn_layers: 3
  rnn_type: LSTM
  rnn_units: 256
fixed_input_height: 128
save_model: true
syms: syms.txt
```

Then run the following command:

```shell
pylaia-htr-create-model --config config_create_model.yaml 2>&1 | tee pylaia_create_model.log
```
Training can also be configured using a configuration file or command-line arguments.
- Run the following command to get the full list of command-line arguments:

```shell
pylaia-htr-train-ctc --help
```
- Or create a YAML configuration file `config_train.yaml`:

```yaml
syms: data/syms.txt
tr_txt_table: data/train.txt
va_txt_table: data/val.txt
common:
  experiment_dirname: experiment
data:
  batch_size: 8
  color_mode: L
optimizer:
  learning_rate: 0.0005
  name: RMSProp
scheduler:
  active: true
  monitor: va_loss
train:
  augment_training: true
  early_stopping_patience: 80
trainer:
  auto_select_gpus: true
  gpus: 1
  max_epochs: 600
```
- Train the model:

```shell
pylaia-htr-train-ctc --config config_train.yaml 2>&1 | tee pylaia_train.log
```