Analyzing the Structure of U.S. Patents Using Patent Families

2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI) (2022)

Abstract
Researchers and developers search for patents in fields related to their own work to learn about open problems and effective technologies in those fields. However, reading the full text of many patents is impractical, so a method for grasping patent information quickly is needed. In this study, we analyze the structure of U.S. patents with the aim of extracting important information. Using Japanese patents with structural tags such as "field", "problem", "solution", and "effect", together with the corresponding U.S. patents in the same patent families, we automatically created a dataset of 81,405 U.S. patents with structural tags. Using this dataset, we then conducted an experiment on automatically assigning a structural tag to each sentence in the U.S. patents. For the embedding layer, we use a language representation model, Bidirectional Encoder Representations from Transformers (BERT), pretrained on patent documents, and construct a multi-label classifier that assigns a given sentence to one of four categories: "field", "problem", "solution", or "effect". Using a loss function that accounts for the imbalance in the amount of data for each structural tag, we classified sentences related to "field", "problem", "solution", and "effect" with a precision of 0.6994, a recall of 0.8291, and an F-measure of 0.7426.
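The abstract does not say which imbalance-aware loss was used. As an illustration only, the sketch below shows one common choice, cross-entropy with inverse-frequency class weights, written in plain Python. The weighting formula, the function names, and the toy label counts are all assumptions made for this example and are not taken from the paper; the sketch also assumes each sentence carries exactly one of the four tags.

```python
import math
from collections import Counter

# The four structural tags named in the abstract.
TAGS = ["field", "problem", "solution", "effect"]


def inverse_frequency_weights(labels):
    """Return one weight per tag: total / (num_tags * count).

    Assumption for this sketch: rarer tags get proportionally larger
    weights. Every tag must occur at least once in `labels`.
    """
    counts = Counter(labels)
    total = len(labels)
    return {tag: total / (len(TAGS) * counts[tag]) for tag in TAGS}


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_cross_entropy(logits, gold_tag, weights):
    """Cross-entropy for one sentence, scaled by the gold tag's weight."""
    probs = softmax(logits)
    return -weights[gold_tag] * math.log(probs[TAGS.index(gold_tag)])


# Hypothetical toy distribution: "solution" sentences dominate.
toy_labels = ["field"] + ["problem"] * 2 + ["solution"] * 6 + ["effect"]
weights = inverse_frequency_weights(toy_labels)

# With identical (uninformative) logits, a mistake on a rare tag
# costs more than a mistake on a frequent one.
rare_loss = weighted_cross_entropy([0.0, 0.0, 0.0, 0.0], "field", weights)
common_loss = weighted_cross_entropy([0.0, 0.0, 0.0, 0.0], "solution", weights)
```

In a real setup the logits would come from a classification head on top of the BERT encoder, and the same per-class weights could be passed to a framework's built-in weighted loss instead of this hand-written version.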
Keywords
patent, document structure analysis, machine translation, machine learning