BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild

IEEE Conference on Computer Vision and Pattern Recognition (2022)

Abstract
As a prerequisite for many text-related tasks such as text erasing and text style transfer, text segmentation has attracted increasing attention recently. Current research focuses mainly on English characters and digits, while few works study Chinese characters because of the lack of public large-scale, high-quality Chinese datasets, which limits the practical application scenarios of text segmentation. Unlike English, which has a limited alphabet, Chinese has many more basic characters with complex structures, making the problem more difficult. To better analyze this problem, we propose the Bi-lingual Text Segmentation (BTS) dataset, a benchmark that covers various common Chinese scenes and includes 14,250 diverse, finely annotated text images. BTS focuses mainly on Chinese characters and also contains English words and digits. We also introduce the Prior Guided Text Segmentation Network (PGTSNet), the first baseline to handle bi-lingual and complex-structured text segmentation. A plug-in text region highlighting module and a text perceptual discriminator are proposed in PGTSNet to supervise the model with text priors and guide it toward more stable and finer text segmentation. A variation loss is also employed to suppress background noise in complex scenes. Extensive experiments are conducted not only to demonstrate the necessity and superiority of the proposed BTS dataset, but also to show the effectiveness of the proposed PGTSNet compared with a variety of state-of-the-art text segmentation methods.
Keywords
Datasets and evaluation, Computer vision theory