Daniel@FinTOC’2 Shared Task: Title Detection and Structure Extraction

International Conference on Computational Linguistics (2020)

Abstract
We present our contributions to the two tracks of the 2020 FinTOC Shared Task: Table of Contents (ToC) extraction in English documents and in French documents. We describe our work on Title Detection and on ToC Extraction separately. For ToC Extraction, we propose an approach that combines information from multiple sources: the table of contents, the wording of the document, and lexical domain knowledge. For Title Detection, we compare surface features with character-based features across various training configurations. We show that title detection results are very sensitive to the kind of training dataset used.
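
To make the contrast between surface features and character-based features concrete, here is a minimal sketch of what each family might look like for a single line of text. It is an illustration only, not the feature set used in the paper: the function names `surface_features` and `char_ngram_features`, the individual cues, and the trigram size are all assumptions chosen for the example.

```python
from collections import Counter


def surface_features(line: str) -> dict:
    """Hand-crafted cues read off the raw line (hypothetical feature choice)."""
    stripped = line.strip()
    words = stripped.split()
    letters = [c for c in stripped if c.isalpha()]
    return {
        "n_words": len(words),
        "n_chars": len(stripped),
        # Numbered headings such as "1. Objective" start with a digit.
        "starts_with_digit": bool(stripped) and stripped[0].isdigit(),
        # Share of alphabetic characters that are upper case.
        "upper_ratio": (
            sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
        ),
        # Body sentences tend to end with a period; titles often do not.
        "ends_with_period": stripped.endswith("."),
    }


def char_ngram_features(line: str, n: int = 3) -> Counter:
    """Bag of overlapping character n-grams (n=3 is an assumed default)."""
    text = line.strip().lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    candidate = "1. INVESTMENT OBJECTIVE"
    print(surface_features(candidate))
    print(char_ngram_features(candidate).most_common(3))
```

Either representation could then be passed to any standard classifier. The surface variant encodes explicit assumptions about what a title looks like, while the character n-gram variant leaves it to the model to find discriminative substrings.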