Research on Text Information Mining Technology of Substation Inspection Based on Improved Jieba

2021 International Conference on Wireless Communications and Smart Grid (ICWCSG), 2021

Abstract
With the development of the smart grid, power systems have accumulated large volumes of data. Substation inspection records are mostly written by hand and stored in power databases as free text, which makes them difficult to use. To address the poor performance of general-purpose word segmentation tools on power-domain text, this paper proposes using the TF-IDF algorithm to improve general Jieba word segmentation. TF-IDF is used to identify and weight power-domain feature words; words with higher weights are added to the keyword list, so that the more important words are retained. The approach achieves effective word segmentation of substation inspection records. Comparative experiments with traditional techniques show that the improved segmentation is more accurate and handles professional power terminology better.
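The weighting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the exact TF-IDF variant, the aggregation rule, or the cutoff, so the textbook formula (term frequency times log inverse document frequency), the "highest weight over all records" aggregation, and the top-k cutoff are all assumptions. The function names `tfidf_scores` and `top_keywords` and the sample records are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {term: highest TF-IDF weight the term reaches in any document}.

    `docs` is a list of token lists (records that were already segmented).
    Assumption: plain tf * log(N / df), with no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        if not doc:
            continue  # skip empty records to avoid division by zero
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            best[term] = max(best.get(term, 0.0), tf * idf)
    return best


def top_keywords(docs, k):
    """Return the k highest-weighted terms (ties broken by term order)."""
    scores = tfidf_scores(docs)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical pre-segmented inspection records (illustrative data only).
records = [
    ["主变压器", "油位", "正常"],  # main transformer, oil level, normal
    ["断路器", "外观", "正常"],    # circuit breaker, appearance, normal
    ["主变压器", "渗油", "正常"],  # main transformer, oil seepage, normal
]

keywords = top_keywords(records, 3)
```

A term that appears in every record, such as "正常" here, receives an IDF of zero and so never enters the keyword list, while rarer domain terms rank higher. The selected terms could then be registered with Jieba, for example through `jieba.add_word` or a user dictionary loaded with `jieba.load_userdict`; whether the paper updates its keyword list this way is not stated in the abstract.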
Keywords
text processing, substation inspection record, Jieba word segmentation, TF-IDF algorithm