Similarity Pyramid: Browsing A Document Database With Respect To Visual Similarity

IMAGING AND PRINTING IN A WEB 2.0 WORLD III(2012)

Abstract
Managing large document databases has become an important task. Sorting documents with respect to their visual similarity and layout features, and visualizing the whole document database, are desirable applications. A user may wish to search a database for documents that are similar to a query in terms of their stylistic features, or may want to browse the whole database. In these tasks, clustering similar documents and organizing the database with respect to the clusters is preferable to presenting documents in a random order. The documents are organized in a 3-D hierarchical structure called a similarity pyramid. The pyramid is constructed from a stack of document database embeddings on a 2-D surface with the help of a nonlinear dimensionality reduction algorithm called Isomap. The mapping algorithm preserves similarity distances between documents by mapping documents that are close to each other in a feature space to nearby points on the 2-D surface. Upper levels of the pyramid contain document image icons that each represent a large group of roughly similar documents, whereas lower levels contain document image icons representing small groups of very similar documents. A user can browse the database by moving along a certain level of the pyramid or by moving between different levels.
Keywords
Document database management, document visual similarity, similarity pyramid, Isomap
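
The hierarchical browsing structure described in the abstract can be illustrated with a small sketch. The abstract does not say how documents are grouped at each level, so the code below makes its own assumptions: the 2-D coordinates are taken as already computed (for example by an Isomap embedding) and scaled to the unit square, each level is a regular grid that doubles in resolution, and the icon for a cell is the member closest to the cell's centroid. The names `build_pyramid` and `children` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_pyramid(points, num_levels):
    """Group embedded documents into a coarse-to-fine stack of grids.

    points: dict mapping doc_id -> (x, y), with x and y in [0, 1].
    Returns a list of levels; level 0 is the coarsest (a single cell).
    Each level maps a grid cell (row, col) to its icon and members.
    """
    pyramid = []
    for level in range(num_levels):
        cells_per_side = 2 ** level
        groups = defaultdict(list)
        for doc_id, (x, y) in points.items():
            # Clamp so that a coordinate of exactly 1.0 stays in the last cell.
            col = min(int(x * cells_per_side), cells_per_side - 1)
            row = min(int(y * cells_per_side), cells_per_side - 1)
            groups[(row, col)].append(doc_id)

        level_cells = {}
        for cell, members in groups.items():
            # Assumption: the icon is the member nearest the cell centroid.
            cx = sum(points[m][0] for m in members) / len(members)
            cy = sum(points[m][1] for m in members) / len(members)
            icon = min(
                members,
                key=lambda m: (points[m][0] - cx) ** 2 + (points[m][1] - cy) ** 2,
            )
            level_cells[cell] = {"icon": icon, "members": sorted(members)}
        pyramid.append(level_cells)
    return pyramid


def children(pyramid, level, cell):
    """Return the non-empty cells one level down that refine `cell`."""
    row, col = cell
    finer = pyramid[level + 1]
    candidates = [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]
    return [c for c in candidates if c in finer]


# Hypothetical 2-D embedding of four documents.
points = {"a": (0.10, 0.10), "b": (0.20, 0.15), "c": (0.80, 0.20), "d": (0.85, 0.90)}
pyramid = build_pyramid(points, 3)
```

Moving along a level corresponds to visiting neighbouring cells of `pyramid[level]`, and moving between levels corresponds to calling `children` on a cell (or halving its row and column indices to go up). With the sample points, the single top-level cell holds all four documents, and the next level splits them into three non-empty cells.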