Legal Textual Entailment Using Ensemble of Rule-Based and BERT-Based Method with Data Augmentation by Related Article Generation

Masaki Fujita, Takaaki Onaga, Ayaka Ueyama, Yoshinobu Kano

New Frontiers in Artificial Intelligence (2023)

Abstract
We report the architecture of our system for COLIEE 2022 Task 4, which targets the textual entailment part of the Japanese legal bar examination. We improved the correct answer ratio with an ensemble of a rule-based method and a BERT-based method. Our proposed approach consists mainly of two parts: augmentation of the training dataset and an ensemble of the methods. For training data augmentation, the civil law articles are first segmented and then reconstructed in all combinations. The data are then expanded further by replacement with negative forms and alphabetical symbols. Because the rule-based method has high precision but low coverage, we combined the methods in a modular fashion in our ensemble. We also integrated other proposed methods: Sentence-BERT to select the necessary data, and person name inference to replace anonymized alphabetical symbols with the actual role name of the person. Comparison with our baseline models confirmed that the proposed methods are effective. Our system achieved an accuracy of 0.6789 on the formal run test dataset, which was the best score among the COLIEE 2022 Task 4 submissions.
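The modular ensemble idea can be illustrated with a minimal Python sketch. The abstract does not specify how the two methods are combined, so this assumes one plausible arrangement: the high-precision rule-based solver answers whenever one of its rules fires, and the BERT-based solver is consulted only when it abstains. All names here (`ensemble_predict`, `toy_rule_solver`, `toy_fallback_solver`) are hypothetical, and the toy solvers are stand-ins, not the authors' rules or model.

```python
from typing import Callable, Optional

# Hypothetical interfaces: the rule-based solver returns True/False when a rule
# fires and None when no rule covers the input; the fallback always answers.
RuleSolver = Callable[[str, str], Optional[bool]]
FallbackSolver = Callable[[str, str], bool]


def ensemble_predict(article: str, statement: str,
                     rule_solver: RuleSolver,
                     fallback_solver: FallbackSolver) -> bool:
    """Prefer the rule-based answer; use the fallback only when rules abstain."""
    rule_answer = rule_solver(article, statement)
    if rule_answer is not None:
        return rule_answer
    return fallback_solver(article, statement)


def toy_rule_solver(article: str, statement: str) -> Optional[bool]:
    """Toy rule (assumption): an exact match is entailed; otherwise abstain."""
    if statement.strip().lower() == article.strip().lower():
        return True
    return None


def toy_fallback_solver(article: str, statement: str) -> bool:
    """Toy stand-in for a BERT classifier: word overlap of at least 50%."""
    article_words = set(article.lower().split())
    statement_words = set(statement.lower().split())
    overlap = len(article_words & statement_words)
    return overlap / max(len(statement_words), 1) >= 0.5


if __name__ == "__main__":
    article = "a minor may rescind the contract"
    for statement in (article, "the seller may rescind", "dogs bark loudly"):
        answer = ensemble_predict(article, statement,
                                  toy_rule_solver, toy_fallback_solver)
        print(statement, "->", answer)
```

Under this assumed arrangement the ensemble inherits the precision of the rules on the inputs they cover, while the fallback model supplies coverage for everything else.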
Keywords
rule-based, BERT-based