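Conversion of the earliest written monuments of Latvian into modern spelling: previous experience and attempts at automation (Latviešu valodas senāko rakstu pieminekļu konvertācija mūsdienu rakstībā: iepriekšējā pieredze un automatizācijas mēģinājumi)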

Aktuālās problēmas literatūras un kultūras pētniecībā: rakstu krājums (2022)

Abstract
This article briefly describes previous experience in preparing early written Latvian texts (16th–18th cc.) for publication in recent times, reviews the first attempts by a working group engaged in modernizing the Corpus of early written Latvian, examines the process and results of converting early texts into modern spelling, and discusses the terminology associated with this conversion. Conversion rules are developed by means of descriptive analytics. The conversion process currently in use comprises the following steps: creation of conversion tables; application of the tables in a software algorithm; automated conversion; post-editing; re-reading and detection of mistakes in the converted source (the whole text, or part of it if the text is large); analysis of mistakes and supplementation or correction of the conversion tables, with an estimate of the usefulness of the corrections; repeated automated conversion; quality estimation. To date, 25 conversion tables have been created. The conversion covers unambiguous, positional, and individual correspondences, which are identified for each text source. The number of correspondence rules ranges from 37 for The Lord’s Prayer by Bruno (1520) to 897 for ‘Undeudsche Psalmen’ (1587). As Latvian spelling develops, the number of rules needed for conversion gradually decreases, because fewer individual rules are required as the spelling becomes more uniform. Because the orthography of 18th c. texts is close to the spelling dominant at the end of the 17th c., we hope to create a conversion pattern that, with minor variation, can be applied to the bulk of the 18th c. sources.
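The table-driven conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the working group's software: the class `ConversionTable`, the functions `convert_token`, `convert_text` and `find_mismatches`, and every rule in the example table are assumptions invented for illustration, not entries from the 25 tables mentioned above. The sketch assumes that individual (whole-word) correspondences take priority, that positional correspondences can be written as regular expressions with context, and that unambiguous correspondences are plain grapheme substitutions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ConversionTable:
    """Hypothetical per-source table with the three rule types."""
    individual: Dict[str, str] = field(default_factory=dict)          # whole words
    positional: List[Tuple[str, str]] = field(default_factory=list)   # regex with context
    unambiguous: Dict[str, str] = field(default_factory=dict)         # grapheme -> grapheme

    def rule_count(self) -> int:
        return len(self.individual) + len(self.positional) + len(self.unambiguous)


def convert_token(token: str, table: ConversionTable) -> str:
    # 1. Individual correspondences override everything else.
    if token in table.individual:
        return table.individual[token]
    # 2. Positional correspondences depend on the surrounding characters.
    for pattern, replacement in table.positional:
        token = re.sub(pattern, replacement, token)
    # 3. Unambiguous correspondences, applied in a single pass with the
    #    longest source graphemes tried first (so "ſch" wins over "ſ").
    if table.unambiguous:
        keys = sorted(table.unambiguous, key=len, reverse=True)
        alternation = re.compile("|".join(re.escape(k) for k in keys))
        token = alternation.sub(lambda m: table.unambiguous[m.group(0)], token)
    return token


def convert_text(text: str, table: ConversionTable) -> str:
    # Convert word tokens only; punctuation and spacing are left untouched.
    return re.sub(r"\w+", lambda m: convert_token(m.group(0), table), text)


def find_mismatches(source: str, post_edited: str,
                    table: ConversionTable) -> List[Tuple[str, str, str]]:
    """List (source, converted, expected) for tokens where automated output
    differs from the post-edited text; assumes both texts have the same
    number of tokens in the same order."""
    src_tokens = re.findall(r"\w+", source)
    ref_tokens = re.findall(r"\w+", post_edited)
    mismatches = []
    for src, ref in zip(src_tokens, ref_tokens):
        converted = convert_token(src, table)
        if converted != ref:
            mismatches.append((src, converted, ref))
    return mismatches


# Illustrative rules only; they are not taken from the article's tables.
example_table = ConversionTable(
    individual={"tows": "tavs"},
    positional=[(r"ah(?=[^aeiou]|$)", "ā")],   # "ah" before a consonant or word end
    unambiguous={"ſch": "š", "ſ": "s", "w": "v"},
)

print(convert_text("tows wahrds", example_table))
print(find_mismatches("tows wahrds deews", "tavs vārds dievs", example_table))
```

The mismatch list stands in for the error-analysis step: each reported token is a candidate for a new or corrected rule, after which the automated conversion is repeated.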