Nonuniform Language In Technical Writing: Detection And Correction

NATURAL LANGUAGE ENGINEERING(2021)

引用 1|浏览17
暂无评分
摘要
Technical writing in professional environments, such as user manual authoring, requires the use of uniform language. Nonuniform language refers to sentences in a technical document that are intended to have the same meaning within a similar context, but use different words or writing style. Addressing this nonuniformity problem requires the performance of two tasks. The first task, which we named nonuniform language detection (NLD), is detecting such sentences. We propose an NLD method that utilizes different similarity algorithms at lexical, syntactic, semantic and pragmatic levels. Different features are extracted and integrated by applying a machine learning classification method. The second task, which we named nonuniform language correction (NLC), is deciding which sentence among the detected ones is more appropriate for that context. To address this problem, we propose an NLC method that combines contraction removal, near-synonym choice, and text readability comparison. We tested our methods using smartphone user manuals. We finally compared our methods against state-of-the-art methods in paraphrase detection (for NLD) and against expert annotators (for both NLD and NLC). The experiments demonstrate that the proposed methods achieve performance that matches expert annotators.
更多
查看译文
关键词
Technical language, Text similarity, Text simplification, Semantic Similarity, Sentence similarity, Paraphrase detection, Text error detection, Text error correction
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要