ADAGIO: Fast Data-Aware Near-Isometric Linear Embeddings

2016 IEEE 16th International Conference on Data Mining (ICDM), 2016

Abstract
Many important applications, including signal reconstruction, parameter estimation, and signal processing in a compressed domain, rely on a low-dimensional representation of the dataset that preserves all pairwise distances between the data points and leverages the inherent geometric structure that is typically present. Recently, Hegde, Sankaranarayanan, Yin and Baraniuk [19] proposed the first data-aware near-isometric linear embedding, which achieves the best of both worlds. However, their method, NuMax, does not scale to large datasets. Our main contribution is a simple, data-aware, near-isometric linear dimensionality reduction method which significantly outperforms a state-of-the-art method [19] with respect to scalability while achieving high-quality near-isometries. Furthermore, our method comes with strong worst-case theoretical guarantees that allow us to guarantee the quality of the obtained near-isometry. We verify experimentally the efficiency of our method on numerous real-world datasets, where we find that our method (<10 secs) is more than 3000× faster than the state-of-the-art method [19] (>9 hours) on medium-scale datasets with 60,000 data points in 784 dimensions. Finally, we use our method as a preprocessing step to increase the computational efficiency of a classification application and to speed up approximate nearest neighbor queries.
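To make the notion of a near-isometry concrete, the sketch below measures how well a linear map preserves pairwise distances on a toy dataset. It is not the ADAGIO or NuMax algorithm. It assumes the usual definition that a matrix P is near-isometric with constant δ if (1 − δ)‖d‖² ≤ ‖Pd‖² ≤ (1 + δ)‖d‖² for every pairwise difference d between data points. The function name `isometry_constant`, the toy data, and the two projections are illustrative assumptions, not taken from the paper.

```python
import itertools


def sq_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(c * c for c in v)


def apply(P, v):
    """Multiply matrix P (a list of rows) by vector v."""
    return [sum(p * c for p, c in zip(row, v)) for row in P]


def isometry_constant(P, points):
    """Smallest delta such that, for every pairwise difference d,
    (1 - delta) * |d|^2 <= |P d|^2 <= (1 + delta) * |d|^2."""
    delta = 0.0
    for x, y in itertools.combinations(points, 2):
        d = [a - b for a, b in zip(x, y)]
        n = sq_norm(d)
        if n == 0.0:
            continue  # identical points impose no constraint
        delta = max(delta, abs(sq_norm(apply(P, d)) / n - 1.0))
    return delta


# Toy data (assumption): a 5 x 5 grid in 3-D lying almost in the xy-plane.
points = [[float(i), float(j), 0.01 * ((i * j) % 3)]
          for i in range(5) for j in range(5)]

# A projection that respects the data's geometry keeps x and y.
P_aware = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
# A projection that ignores it keeps x and the nearly constant z.
P_blind = [[1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0]]

delta_aware = isometry_constant(P_aware, points)
delta_blind = isometry_constant(P_blind, points)
print(f"data-aware projection:  delta = {delta_aware:.6f}")
print(f"data-blind projection:  delta = {delta_blind:.6f}")
```

On this toy grid, the projection matched to the data's geometry distorts squared distances far less than the one that discards an informative coordinate, which is the sense in which a data-aware embedding can achieve a small δ in few dimensions.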
Keywords
ADAGIO,fast data-aware near-isometric linear embeddings,signal reconstruction,parameter estimation,signal processing,low-dimensional representation,NuMax,approximate nearest neighbor queries,principal component analysis