Deep Web Data Integration System Based on Scrapy Framework

2022 2nd Asia Conference on Information Engineering (ACIE) (2022)

Abstract
At present, most of the data on the World Wide Web is hidden in databases behind web forms that traditional search engines cannot reach. Users can access this deep web data only through the query interface that each site provides, and different sites often expose different query interfaces. Integrating deep web data is therefore largely a matter of integrating query interfaces. Targeting the online bookstore domain, this paper uses Python and the Scrapy framework to design and implement crawling algorithms for multiple bookstores, which are successfully applied in a deep web data integration system. Running the system shows that the user only needs to enter a keyword in the system's unified query interface to search the integrated bookstores, obtain book information, and receive search results automatically sorted by price. The system not only saves users the time spent switching between websites, but also provides them with a cost-effective service.
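The flow described in the abstract (one keyword, several site-specific query interfaces, one price-sorted result list) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the store names, URLs, form-field names, and the `Book`, `build_query_urls`, and `integrate` helpers are all hypothetical, and the actual crawling step (done with Scrapy in the paper) is left out.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class Book:
    title: str
    price: float
    store: str


# Hypothetical per-store query interfaces: the search URL and the name of
# the form field that each site uses for the keyword (assumed values).
STORE_INTERFACES = {
    "store_a": ("https://store-a.example/search", "q"),
    "store_b": ("https://store-b.example/books", "keyword"),
}


def build_query_urls(keyword):
    """Map one unified keyword onto each store's own query interface."""
    return {
        store: f"{base}?{urlencode({field: keyword})}"
        for store, (base, field) in STORE_INTERFACES.items()
    }


def integrate(results_per_store):
    """Merge the per-store result lists and sort them by ascending price."""
    merged = [book for books in results_per_store.values() for book in books]
    return sorted(merged, key=lambda book: book.price)
```

In a Scrapy-based system, each URL returned by `build_query_urls` would presumably be fetched and parsed by a store-specific spider, and the parsed items would then be passed to a merge step such as `integrate`.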
Keywords
Deep Web, Data Integration, Query Interface, Web Crawler