Prosta Mova HTR Model (Puigcerver CRNN)
A Handwritten Text Recognition (HTR) model for Prosta Mova — the Ruthenian language used in Ukrainian and Belarusian documents from the 16th to 18th centuries. Trained on early printed books from the Ostroh Printery and Academy; suitable for Ukrainian Church Slavonic texts as well as texts written in Prosta Mova. Based on the CNN + BiLSTM + CTC architecture introduced in Puigcerver (2017) and used as the backbone of PyLaia and Transkribus.
This is a clean-room PyTorch reimplementation of that published architecture (PyLaia-inspired). It does not use the PyLaia Python package and is not loadable by it — training and inference run via plain PyTorch (see Usage below).
Model Details
- Architecture: CNN encoder [12, 24, 48, 48 filters] + 3-layer Bidirectional LSTM (256 units) + CTC decoder (Puigcerver 2017)
- Input: Grayscale line images, normalized to 128 px height with aspect ratio preserved
- Output: UTF-8 text (East/Ukrainian Church Slavonic and Prosta Mova Cyrillic with diacritical marks)
- Vocabulary: 186 symbols (symbols.txt), including punctuation and combining diacritics
- Framework: pure PyTorch; a clean-room reimplementation of the Puigcerver (2017) architecture (PyLaia-inspired). The PyLaia package is not required
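The CTC decoder listed above turns per-frame label predictions into text. The following is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It assumes the blank label has index 0 and uses a hypothetical five-entry symbol list; the blank index, the layout of symbols.txt, and the decoding strategy actually used by the inference script may differ.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank; the real model may differ


def ctc_greedy_decode(frame_ids, symbols, blank=BLANK):
    """Best-path decoding: collapse repeated labels, then drop blanks."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(symbols[i] for i in collapsed if i != blank)


# Hypothetical toy vocabulary, not the model's 186-symbol table.
symbols = ["<blank>", "м", "о", "в", "а"]

# One argmax label per time step, as a CRNN would emit along the line width.
frame_ids = [1, 1, 0, 2, 2, 2, 0, 3, 4, 4]

decoded = ctc_greedy_decode(frame_ids, symbols)
print(decoded)
```

A blank between two identical labels keeps them as separate characters, which is how CTC represents doubled letters.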
Performance
| Metric | Value |
|---|---|
| Validation CER | 3.77% |
| Training epochs | 97 |
| Training lines | 58,843 |
| Training pages | 948 |
| Validation lines | 2,588 |
| Validation pages | 54 |
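For readers unfamiliar with the metric, the sketch below shows one common way to compute a character error rate (CER): total Levenshtein edit distance divided by the total number of reference characters. The function names are illustrative, and the source does not state how the reported figure was computed, so details such as Unicode normalization or whitespace handling may differ.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: summed edits over summed reference length."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Hypothetical example lines, not taken from the validation set.
score = cer(["abcd", "kitten"], ["abed", "sitting"])
print(f"{score:.2%}")
```

Python strings are sequences of code points, so in this sketch a base letter followed by a combining diacritic counts as two characters.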
Training Data
Trained on images of early printed books transcribed and exported from Transkribus (see the corresponding Transkribus model page). The dataset covers Church Slavonic and Ruthenian printings from the Ostroh Printery and Academy (late 16th to early 17th century).
Source texts:
- Ostroh Bible (1581)
- Kniga o postničestvě (1594)
- Margarit (1595)
- Otpis na list v boze velebnogo otca Ipatija volodimirs'kago i berestejskogo episkopa (1598)
- Apokrisis (1598–99)
- Pravilo istinnago života christianskogo (Psaltir z vozsliduvannjam) (1598)
The Transkribus training collection comprises 962 pages and 59,990 lines. Our CRNN-CTC model was trained on a corresponding export: 58,843 training lines (948 pages) and 2,588 validation lines (54 pages). Before training, the images were preprocessed to correct EXIF rotation artifacts, and line images were resized with the aspect ratio preserved to maintain character resolution.
The Transkribus model was created by Martin Meindl as part of the Continslav project, building on a generic East Church Slavonic printings model by Achim Rabus. Training data was prepared by Uliana Shtandenko and Alexandre Trébuchon. Model curated by Achim Rabus (Slavic Department, University of Freiburg).
Usage
Requirements
pip install torch torchvision pillow
Inference
Download best_model.pt, symbols.txt, and model_config.json from this repository, then use the inference script from polyscriptor:
from inference_pylaia_native import PyLaiaInference
from PIL import Image
# Load model
model = PyLaiaInference(
    checkpoint_path="best_model.pt",
    syms_path="symbols.txt",
)
# Transcribe a line image
image = Image.open("line_image.jpg")
text = model.transcribe(image)
print(text)
Note: Input should be a single text line image, not a full page. Preprocessing (grayscale conversion, height normalization, aspect ratio preservation) is handled automatically by inference_pylaia_native.py.
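To illustrate what height normalization with a preserved aspect ratio means, here is a minimal sketch of the size computation. The helper name `target_size` is hypothetical and the rounding rule is an assumption; the actual preprocessing in inference_pylaia_native.py may differ in detail.

```python
def target_size(width, height, target_height=128):
    """Return (new_width, new_height) with the height fixed and the ratio kept."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target_height / height
    # Assumption: round to the nearest pixel and never drop below 1 px wide.
    new_width = max(1, round(width * scale))
    return new_width, target_height


# A 2000x64 px line is upscaled, a 1500x256 px line is downscaled.
print(target_size(2000, 64))
print(target_size(1500, 256))
```

With Pillow, the result could be passed to `image.convert("L").resize(target_size(*image.size))` to obtain a grayscale line image of the expected height.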
For full-page inference with automatic line segmentation, use batch_processing.py:
python batch_processing.py \
    --engine crnn-ctc \
    --model-path best_model.pt \
    --input-folder images/ \
    --output-folder output/
GUI Usage
polyscriptor also ships graphical interfaces that handle full-page processing without requiring pre-segmented line images:
Interactive single-page GUI — loads raw page images, performs automatic line segmentation, and can export results as PAGE XML:
python transcription_gui_plugin.py
Batch processing GUI — processes entire folders; auto-detects existing PAGE XML files (e.g. from Transkribus) and uses them for segmentation when available:
python polyscriptor_batch_gui.py
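As an illustration of how existing PAGE XML can supply the segmentation, the sketch below reads line polygons from a PAGE XML string using only the Python standard library. It is not polyscriptor's own parser; the function names and the sample document are hypothetical, and the code matches element names without their namespace so that different PAGE schema versions are accepted.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_text_lines(page_xml):
    """Return a list of (line_id, [(x, y), ...]) for every TextLine."""
    root = ET.fromstring(page_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Only look at the line's own Coords child, not the region's.
        coords = next((c for c in element if local_name(c.tag) == "Coords"), None)
        points = []
        if coords is not None:
            for pair in coords.get("points", "").split():
                x, y = pair.split(",")
                points.append((int(x), int(y)))
        lines.append((element.get("id"), points))
    return lines


# Hypothetical minimal PAGE XML document for demonstration.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="50,150 950,150 950,300 50,300"/>
      <TextLine id="r1l1">
        <Coords points="100,200 900,200 900,260 100,260"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

text_lines = read_text_lines(SAMPLE)
print(text_lines)
```

Each polygon can then be used to crop a line image from the page before it is passed to the recognizer.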
Intended Use
- Transcription of Prosta Mova and (Ukrainian) Church Slavonic early printed books
- Ukrainian and Belarusian historical document digitization (16th–18th centuries)
- Digital humanities research on early modern East Slavic texts
Limitations
- Optimized for Ostroh Printery-style printings; may underperform on other sources
- Full-page segmentation quality depends on the segmentation method used upstream
Citation
If you use this model in your research, please cite the architecture paper and this model:
@inproceedings{puigcerver2017multidimensional,
title = {Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?},
author = {Puigcerver, Joan},
booktitle = {Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
year = {2017},
url = {https://www.jpuigcerver.net/pubs/jpuigcerver_icdar2017.pdf}
}
@misc{rabus2026polyscriptor,
title = {Polyscriptor: Multi-Engine HTR Training \& Comparison Tool},
author = {Rabus, Achim},
year = {2026},
url = {https://github.com/achimrabus/polyscriptor}
}