BPE-Dropout: Simple and Effective Subword Regularization

ACL, pp. 1882-1892, 2020.

We introduce BPE-dropout, a simple and effective subword regularization method that operates within the standard Byte Pair Encoding (BPE) framework.

Abstract

Subword segmentation is widely used to address the open vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens. While multiple segmentations are possible even with the same vocabula...
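
To make the idea concrete, here is a minimal Python sketch of BPE segmentation with merge dropout. It is an illustration, not the authors' implementation: the function name `bpe_segment`, the `dropout` parameter, and the toy merge table `MERGES` are assumptions made for this example. The sketch assumes that BPE-dropout skips each candidate merge with some probability during segmentation, so the same word can receive different segmentations from the same merge table.

```python
import random

# Hypothetical toy merge table, ordered by priority (earlier = learned first).
MERGES = [("l", "o"), ("lo", "w"), ("e", "r")]


def bpe_segment(word, merges, dropout=0.0, rng=None):
    """Segment `word` using ranked BPE merges.

    At every step, each applicable merge is independently skipped with
    probability `dropout` (assumed behaviour, see the text above).
    dropout=0.0 gives plain deterministic BPE; dropout=1.0 leaves the word
    split into characters.
    """
    rng = rng or random.Random()
    rank = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)  # start from single characters

    while len(tokens) > 1:
        # Adjacent pairs that are in the merge table and survive dropout.
        candidates = [
            (rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
            if (a, b) in rank and rng.random() >= dropout
        ]
        if not candidates:
            break
        # Apply the highest-priority surviving merge (leftmost on ties).
        _, i = min(candidates)
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]

    return tokens


if __name__ == "__main__":
    print(bpe_segment("lower", MERGES))               # plain BPE
    print(bpe_segment("lower", MERGES, dropout=1.0))  # every merge dropped
    print(bpe_segment("lower", MERGES, dropout=0.5))  # varies between runs
```

With `dropout=0.0` the sketch reduces to ordinary BPE, which is why a scheme like this can sit inside the standard BPE framework: the merge table is unchanged and only the segmentation step becomes stochastic.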
