Text Recognition Model for Yiddish in Vaybertaytsh Typeface, Based on Community Regulations

Ronny Reshef*, Mirjam Gutschow

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

We present a public PyLaia text recognition model, accompanied by a baseline model for the layout of community regulations in Yiddish and a dataset of Yiddish texts printed in Vaybertaytsh typeface. The model was built using legal documents, namely regulations written by the Ashkenazi Jewish community in Amsterdam during the 18th century. The necessity of such a model for Vaybertaytsh typeface stems from the substantial differences between it and other Yiddish or Hebrew typefaces. Existing text recognition models for Yiddish are dedicated to handwritten texts or to substantially different typefaces. We give a short description of the dataset, its unique characteristics, and how it can be used further. The process of training the text recognition model is explained, and the challenges encountered are specified, along with strategies for coping with them. The model is publicly accessible via Transkribus, and the complete dataset used to train the model is available via Figshare. The models and dataset offer valuable contributions to the digital humanities, specifically for research on linguistics, Jewish history and related fields.
Original language: English
Article number: 35
Pages (from-to): 1-10
Number of pages: 10
Journal: Journal of Open Humanities Data
Volume: 10
Publication status: Published - 6 May 2024

Bibliographical note

Publisher Copyright: © 2024 The Author(s).

Research programs

  • RSM ORG
