Cross-collection Dataset of Public Domain Portuguese-language Works.

Journal of Information and Data Management (2022)

Abstract
Many datasets are published in English to gain engagement, popularity, and reach within a research community. Indeed, most sciences are language-agnostic and thrive on publicly available data. However, this does not always hold for the Arts, where Literature and Music are two examples of fields that rely heavily on the language of the work. In Literature especially, combining human expertise with book consumers' data may provide what is needed to keep up with the constant changes experienced in the book publishing market. Therefore, we introduce PPORTAL, the first public domain Portuguese-language literature dataset, which comprises a wide variety of book-related metadata. After introducing its building process and content, we present an exploratory data analysis with a quantitative description of its main features. We also show its usability as a resource in different research domains through examples of real-world applications, and point out other potential applications.
Keywords
cross-collection, Portuguese-language