Traitements Automatiques pour la Migration de Documents Numériques vers XML

Document Numérique (2006)

Abstract
More and more companies are migrating their legacy document management systems to XML, the industry standard for data exchange. To reduce the migration cost, we propose an approach that automates the conversion of layout-oriented documents to semantic-oriented annotations. The conversion module uses supervised machine learning techniques to learn a conversion model for a collection of documents. The conversion is achieved by semantically annotating the document content and then structuring the annotations according to an XML schema that specifies the class of target documents.
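The two steps named in the abstract (learn an annotation model from labeled examples, then arrange the annotations into the target structure) can be illustrated with a small sketch. This is not the authors' system: the layout features, the simplified naive-Bayes-style scorer, the labels `title`, `author`, `paragraph`, the `article` root element, and the fixed `TARGET_ORDER` list standing in for an XML schema are all assumptions made for illustration.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical stand-in for an XML schema: allowed child elements, in order.
TARGET_ORDER = ["title", "author", "paragraph"]


def features(fragment):
    """Turn a layout-oriented fragment into a set of symbolic features."""
    feats = {f"size={fragment['font_size']}"}
    if fragment.get("bold"):
        feats.add("bold")
    feats.add("short" if len(fragment["text"].split()) <= 6 else "long")
    return feats


def train(examples):
    """Count how often each feature occurs with each semantic label."""
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    for fragment, label in examples:
        label_counts[label] += 1
        for feat in features(fragment):
            feat_counts[label][feat] += 1
    return label_counts, feat_counts


def annotate(model, fragment):
    """Pick the label with the best smoothed score over the features present."""
    label_counts, feat_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        for feat in features(fragment):
            # Add-one smoothing so unseen features do not zero out a label.
            score += math.log((feat_counts[label][feat] + 1) / (n + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def to_xml(model, fragments):
    """Annotate each fragment, then emit elements in the target order."""
    labeled = [(annotate(model, frag), frag["text"]) for frag in fragments]
    root = ET.Element("article")
    for tag in TARGET_ORDER:
        for label, text in labeled:
            if label == tag:
                ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


# Toy training data, invented for this sketch.
training = [
    ({"text": "Migrating Legacy Documents", "font_size": 18, "bold": True}, "title"),
    ({"text": "A Study of Layout Analysis", "font_size": 18, "bold": True}, "title"),
    ({"text": "Author One", "font_size": 12, "bold": False}, "author"),
    ({"text": "Author Two", "font_size": 12, "bold": False}, "author"),
    ({"text": "Many companies store documents in formats that describe layout rather than meaning.",
      "font_size": 10, "bold": False}, "paragraph"),
    ({"text": "A conversion step is needed before such documents can be exchanged as structured data.",
      "font_size": 10, "bold": False}, "paragraph"),
]
model = train(training)

new_document = [
    {"text": "Author Three", "font_size": 12, "bold": False},
    {"text": "Converting Reports to XML", "font_size": 18, "bold": True},
    {"text": "The conversion model is learned from annotated examples of the same collection.",
     "font_size": 10, "bold": False},
]
xml_output = to_xml(model, new_document)
print(xml_output)
```

A real schema would constrain nesting and cardinality as well as order, and the annotation model would be trained on a much richer feature set; the flat list here only shows where such constraints would plug in.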
Keywords
XML, information extraction, supervised learning, machine learning, document management, XML schema, data exchange