TNA Fellow Anna Mędrzecka-Stefańska

Polish poetry corpus: creation, integration with PoeTree, computational poetics

The project aims to expand a corpus of Polish poetry and integrate it into the multilingual PoeTree corpus, enabling computational studies in poetics. It has three objectives: first, to expand the annotated corpus of Polish poetry in collaboration with CLARIN-PL; second, to integrate this corpus into PoeTree, which currently includes over 330,000 poems across ten languages but lacks Polish poetry; and third, to conduct pilot studies in versology using the integrated corpus.

The work involves building a balanced dataset from various sources, annotating and enriching the texts, and running preliminary analyses of rhyme and verse patterns. The ultimate goal is a resource that supports large-scale comparative and computational research, contributing to Polish poetic studies and to the broader field of digital literary scholarship.