Improving Language Model Pretraining with Text Structure Information

ICLR 2023

Abstract
Inter-sentence pretraining tasks learn from sentence relationships and facilitate high-level language understanding that cannot be learned directly from word-level pretraining tasks. However, we find experimentally that existing inter-sentence methods for general-purpose language pretraining improve performance only at relatively small model scales, not at larger ones. As an alternative, we propose Text Structure Prediction (TSP), a more sophisticated inter-sentence task that uses text structure to provide richer self-supervised learning signals to pretraining models at larger scales. TSP classifies sentence pairs over six designed text structure relationships, and it can be seen as an implicit way of learning high-level language understanding by identifying key concepts and relationships in texts. Experiments show that TSP improves performance on language understanding tasks for models at various scales. Our approach thus serves as an initial demonstration that exploiting text structure can facilitate language understanding.
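
The abstract gives no implementation details beyond the six-way sentence-pair classification, so the following is only a minimal sketch of what such an auxiliary pretraining head could look like, assuming a BERT-style encoder from the Hugging Face transformers library. The model name, class names, and the label id in the loss are hypothetical placeholders, and the six relationship classes themselves are not specified in this abstract; this is not the authors' code.

```python
# Sketch (not the authors' implementation) of an inter-sentence classification
# head in the spirit of TSP: a transformer encoder reads a sentence pair and a
# linear layer predicts one of six text-structure relationships.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

NUM_TSP_CLASSES = 6  # six text structure relationships (taxonomy not given in the abstract)

class TSPHead(nn.Module):
    def __init__(self, encoder_name: str = "bert-base-uncased"):  # placeholder encoder
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Classify the pooled [CLS] representation of the sentence pair.
        self.classifier = nn.Linear(hidden, NUM_TSP_CLASSES)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask,
                           token_type_ids=token_type_ids)
        cls = out.last_hidden_state[:, 0]   # [CLS] token representation
        return self.classifier(cls)         # logits over the six relations

# Usage sketch: encode one sentence pair and compute a cross-entropy loss
# against a hypothetical relationship label (id 0 here is arbitrary).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TSPHead()
batch = tokenizer("Sentence A.", "Sentence B.", return_tensors="pt")
logits = model(**batch)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0]))
```

In a full pretraining setup this loss would presumably be added to the word-level objective rather than trained alone, but that combination is an assumption on our part and is not described in the abstract.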
Keywords
Language Model Pretraining, Representation Learning