Compressed Linear Algebra for Large-Scale Machine Learning.

PVLDB 9, no. 12 (2016): 960-971


Abstract

Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Hence, the authors initiate work on compressed linear algebra (CLA), in which lightweight compression techniques are applied to matrices and linear algebra operations are executed directly on the compressed representations.

Introduction
  • Data has become a ubiquitous resource [16]. Large-scale machine learning (ML) leverages these large data collections in order to find interesting patterns and build robust predictive models [16, 19].
  • SystemML [21] aims at declarative ML [12], where algorithms are expressed in a high-level scripting language with R-like syntax and compiled to hybrid runtime plans that combine single-node, in-memory operations with distributed operations on MapReduce or Spark [28].
  • Information about data size and sparsity is propagated from the inputs through the entire program to enable worst-case memory estimates per operation (a minimal sketch of such an estimate follows below).
  • These estimates are used during an operator-selection step, yielding a DAG of low-level operators, which is compiled into a runtime program of executable instructions.
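
To make the worst-case estimates above concrete, here is a minimal sketch of a memory estimate for a single matrix operand, derived from its dimensions and sparsity. The 8-byte values, 4-byte indices, and CSR-like layout are illustrative assumptions, not SystemML's actual internals.

```python
# Hypothetical sketch: worst-case memory estimate for one matrix operand.
# The byte constants and CSR-like layout are assumptions for illustration,
# not SystemML's actual code.

def estimate_mem_bytes(rows: int, cols: int, sparsity: float) -> float:
    dense = rows * cols * 8.0                       # dense array of 8-byte doubles
    nnz = sparsity * rows * cols                    # expected number of non-zeros
    sparse = nnz * (8.0 + 4.0) + (rows + 1) * 4.0   # CSR: values + column indices + row pointers
    return min(dense, sparse)                       # pick the cheaper representation

# Example: a 10M x 1K matrix at 1% sparsity -> ~1.2 GB sparse vs. 80 GB dense.
print(estimate_mem_bytes(10_000_000, 1_000, 0.01))
```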
Highlights
  • Data has become a ubiquitous resource [16]
  • Declarative machine learning (ML): State-of-the-art, large-scale ML aims at declarative ML algorithms [12], expressed in high-level languages, which are often based on linear algebra, i.e., matrix multiplications, aggregations, element-wise and statistical operations
  • SystemML [21] aims at declarative ML [12], where algorithms are expressed in a high-level scripting language with R-like syntax and compiled to hybrid runtime plans that combine single-node, in-memory operations with distributed operations on MapReduce or Spark [28]
  • Compressed linear algebra (CLA) shows similar operations performance at significantly better compression ratios, which is crucial for end-to-end performance improvements of large-scale ML
  • We have initiated work on compressed linear algebra (CLA), in which matrices are compressed with lightweight techniques and linear algebra operations are performed directly over the compressed representation (see the sketch after this list)
  • Our experiments show operations performance close to the uncompressed case and compression ratios similar to heavyweight formats like Gzip but better than lightweight formats like Snappy, providing significant performance benefits when data does not fit into memory
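
To illustrate operating directly on compressed data, below is a minimal sketch of matrix-vector multiplication over compressed columns. The offset-list encoding used here (a list of row indices per distinct value) is a simplified stand-in for CLA's actual column encodings such as OLE and RLE, and the function names are hypothetical, not the SystemML implementation.

```python
# Sketch: matrix-vector multiplication executed directly on compressed columns.
# Each column is encoded as {distinct value -> row offsets}, a simplification
# of CLA's offset-list (OLE) encoding; not the actual SystemML implementation.
from typing import Dict, List

def compress_column(col: List[float]) -> Dict[float, List[int]]:
    """Map each distinct non-zero value to the rows where it occurs."""
    offsets: Dict[float, List[int]] = {}
    for i, v in enumerate(col):
        if v != 0.0:                        # sparse-safe: zeros need no entries
            offsets.setdefault(v, []).append(i)
    return offsets

def matvec(columns: List[Dict[float, List[int]]],
           v: List[float], nrows: int) -> List[float]:
    """Compute q = M v over the compressed columns of an nrows x len(columns) matrix M."""
    q = [0.0] * nrows
    for j, offsets in enumerate(columns):
        for value, rows in offsets.items():
            scaled = value * v[j]           # one multiplication per distinct value
            for i in rows:
                q[i] += scaled              # scatter the pre-scaled contribution
    return q

# Example: a 4x2 matrix with few distinct values per column.
M = [[7.0, 1.0], [7.0, 0.0], [3.0, 1.0], [7.0, 1.0]]
cols = [compress_column([row[j] for row in M]) for j in range(2)]
print(matvec(cols, [1.0, 2.0], nrows=4))    # [9.0, 7.0, 5.0, 9.0]
```

The same value-centric layout is what makes sparse-safe aggregates cheap: sum(M), for instance, reduces to each distinct value times its occurrence count.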
Methods
  • The major insights are as follows. Operations performance: CLA achieves in-memory matrix-vector multiply performance close to the uncompressed case.
  • Sparse-safe scalar and aggregate operations show huge improvements due to value-based computation.
  • Compression Ratio: CLA yields substantially better compression ratios than lightweight general-purpose compression.
  • CLA provides large end-to-end performance improvements, of up to 26x, when uncompressed or lightweight-compressed matrices do not fit in memory.
  • Effective Compression Planning: Sampling-based compression planning yields both reasonable compression time and good compression plans, i.e., good choices of encoding formats and co-coding schemes (a simplified planning sketch follows this list).
  • The authors obtain good compression ratios at compression costs that are amortized over the many iterations of typical ML algorithms.
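
The planning step can be illustrated with a deliberately naive sketch: estimate per-column compressed sizes from a small sample and pick the cheapest format. CLA's actual planner uses principled statistical estimators (e.g., for the number of distinct values) and also considers co-coding groups of correlated columns; the linear extrapolations and size formulas below are simplifying assumptions.

```python
# Naive sketch of sampling-based compression planning: extrapolate per-column
# statistics from a contiguous sample and pick the cheapest encoding. CLA uses
# proper statistical estimators; these formulas are illustrative assumptions.
import random
from typing import Dict, List

def estimate_sizes(col: List[float], sample_frac: float = 0.05) -> Dict[str, float]:
    n = len(col)
    k = max(1, int(n * sample_frac))
    start = random.randrange(0, n - k + 1)
    sample = col[start:start + k]           # contiguous sample keeps runs intact

    distinct = len(set(sample))             # dictionary entries (underestimate)
    nnz = sum(1 for x in sample if x != 0.0)
    runs = 1 + sum(1 for a, b in zip(sample, sample[1:]) if a != b)
    scale = n / k                           # crude linear extrapolation

    return {
        "uncompressed": 8.0 * n,                     # dense 8-byte doubles
        "OLE": 8.0 * distinct + 4.0 * nnz * scale,   # dictionary + offset lists
        "RLE": 8.0 * distinct + 8.0 * runs * scale,  # dictionary + (start, length) runs
    }

def choose_format(col: List[float]) -> str:
    """Pick the encoding with the smallest estimated size."""
    sizes = estimate_sizes(col)
    return min(sizes, key=sizes.get)

# Example: a highly repetitive column should favor run-length encoding.
col = [1.0] * 900 + [2.0] * 100
print(choose_format(col))                   # likely "RLE"
```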
Results
  • The differences in compressed sizes were less than 2.5% in all cases.
  • CLA shows similar operations performance at significantly better compression ratios, which is crucial for end-to-end performance improvements of large-scale ML.
  • As long as the data fits in aggregated memory (Mnist80m, 180 GB), all runtimes are almost identical, with Snappy and CLA showing overheads of up to 25% and 10%, respectively.
  • For non-iterative algorithms, CLA is up to 32% slower, while Snappy shows less than 12% overhead.
Conclusion
  • The authors have initiated work on compressed linear algebra (CLA), in which matrices are compressed with lightweight techniques and linear algebra operations are performed directly over the compressed representation.
  • CLA generalizes sparse matrix representations, encoding both dense and sparse matrices in a universal compressed form.
  • CLA is broadly applicable to any system that provides blocked matrix representations, linear algebra, and physical data independence.
  • Interesting future work includes (1) full optimizer integration, (2) global planning and physical design tuning, (3) alternative compression schemes, and (4) operations beyond matrix-vector multiplication.
Tables
  • Table1: Compression Ratios of Real Datasets
  • Table2: Overview ML Algorithm Core Operations
  • Table3: Compression Plans of Individual Datasets
  • Table4: Compression Ratio CSR-VI vs. CLA
  • Table5: Size Estimation Accuracy (Average ARE)
  • Table6: Mnist8m Deserialized RDD Storage Size
  • Table7: End-to-End Performance, Mnist40m (90 GB) and Mnist240m (540 GB)
  • Table8: End-to-End Performance, ImageNet15 (65 GB) and ImageNet150 (650 GB)
Related work
  • We generalize sparse matrix representations via compression and accordingly review related work on database compression, sparse linear algebra, and compression planning.

    Compressed Databases: The notion of compressing databases appears in the literature back in the early 1980s [5, 18], although most early work focuses on the use of general-purpose techniques like Huffman coding. An important exception is the Model 204 database system, which used compressed bitmap indexes to speed up query processing [38]. More recent systems that use bitmap-based compression include FastBit [49], Oracle [39], and Sybase IQ [45]. Graefe and Shapiro's 1991 paper "Data Compression and Database Performance" more broadly introduced the idea of using compression to improve query performance by evaluating queries in the compressed domain [23], primarily with dictionary-based compression. Westmann et al. explored storage, query processing, and optimization with regard to lightweight compression techniques [47]. Later, Raman and Swart investigated query processing over heavyweight Huffman coding schemes [40], where they also showed the benefit of column co-coding. Recent examples of relational database systems that use multiple types of compression to speed up query processing include C-Store/Vertica [43], SAP HANA [11], IBM DB2 with BLU Acceleration [41], Microsoft SQL Server [36], and HyPer [35]. SciDB, an array database, also uses compression but decompresses arrays block-wise for each operation [44]. Further, Kimura et al. made a case for compression-aware physical design tuning to overcome suboptimal design choices [33], which requires estimating the sizes of compressed indexes. Existing estimators focus on compression schemes such as null suppression and dictionary encoding [29], where the latter is again related to estimating the number of distinct values. Other estimators focus on index layouts such as RID-list and prefix-key compression [9].
References
  • [2] A. Alexandrov et al. The Stratosphere Platform for Big Data Analytics. VLDB J., 23(6), 2014.
  • [3] A. Ashari et al. An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs. In ICS (Intl. Conf. on Supercomputing), 2014.
  • [4] A. Ashari et al. On Optimizing Machine Learning Workloads via Kernel Fusion. In PPoPP (Principles and Practice of Parallel Programming), 2015.
  • [6] N. Bell and M. Garland. Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors. In SC (Supercomputing Conf.), 2009.
  • [7] J. Bergstra et al. Theano: a CPU and GPU Math Expression Compiler. In SciPy, 2010.
  • [8] K. S. Beyer et al. On Synopses for Distinct-Value Estimation Under Multiset Operations. In SIGMOD, 2007.
  • [9] B. Bhattacharjee et al. Efficient Index Compression in DB2 LUW. PVLDB, 2(2), 2009.
  • [12] M. Boehm et al. Declarative Machine Learning – A Classification of Basic Properties and Types. CoRR, 2016.
  • [14] M. Charikar et al. Towards Estimation Error Guarantees for Distinct Values. In SIGMOD, 2000.
  • [15] R. Chitta et al. Approximate Kernel k-means: Solution to Large Scale Kernel Clustering. In KDD, 2011.
  • [16] J. Cohen et al. MAD Skills: New Analysis Practices for Big Data. PVLDB, 2(2), 2009.
  • [17] C. Constantinescu and M. Lu. Quick Estimation of Data Compression and De-duplication for Large Storage Systems. In CCP (Data Compression, Comm. and Process.), 2011.
  • [19] S. Das et al. Ricardo: Integrating R and Hadoop. In SIGMOD, 2010.
  • [20] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI, 2004.
  • [21] A. Ghoting et al. SystemML: Declarative Machine Learning on MapReduce. In ICDE, 2011.
  • [23] G. Graefe and L. D. Shapiro. Data Compression and Database Performance. In Applied Computing, 1991.
  • [24] P. J. Haas and L. Stokes. Estimating the Number of Classes in a Finite Population. J. Amer. Statist. Assoc., 93(444), 1998.
  • [25] D. Harnik et al. Estimation of Deduplication Ratios in Large Data Sets. In MSST (Mass Storage Sys. Tech.), 2012.
  • [26] D. Harnik et al. To Zip or not to Zip: Effective Resource Usage for Real-Time Compression. In FAST, 2013.
  • [27] B. Huang et al. Cumulon: Optimizing Statistical Data Analysis in the Cloud. In SIGMOD, 2013.
  • [28] B. Huang et al. Resource Elasticity for Large-Scale Machine Learning. In SIGMOD, 2015.
  • [29] S. Idreos et al. Estimating the Compression Fraction of an Index using Sampling. In ICDE, 2010.
  • [30] N. L. Johnson et al. Univariate Discrete Distributions. Wiley, New York, 2nd edition, 1992.
  • [31] V. Karakasis et al. An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication. TPDS (Trans. Par. and Dist. Systems), 24(10), 2013.
  • [32] D. Kernert et al. SLACID - Sparse Linear Algebra in a Column-Oriented In-Memory Database System. In SSDBM, 2014.
  • [33] H. Kimura et al. Compression Aware Physical Database Design. PVLDB, 4(10), 2011.
  • [34] K. Kourtis et al. Optimizing Sparse Matrix-Vector Multiplication Using Index and Value Compression. In CF (Computing Frontiers), 2008.
  • [35] H. Lang et al. Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation. In SIGMOD, 2016.
  • [36] P. Larson et al. SQL Server Column Store Indexes. In SIGMOD, 2011.
  • [40] V. Raman and G. Swart. How to Wring a Table Dry: Entropy Compression of Relations and Querying of Compressed Relations. In VLDB, 2006.
  • [41] V. Raman et al. DB2 with BLU Acceleration: So Much More than Just a Column Store. PVLDB, 6(11), 2013.
  • [43] M. Stonebraker et al. C-Store: A Column-oriented DBMS. In VLDB, 2005.
  • [44] M. Stonebraker et al. The Architecture of SciDB. In SSDBM, 2011.
  • [46] G. Valiant and P. Valiant. Estimating the Unseen: An n/log(n)-sample Estimator for Entropy and Support Size, Shown Optimal via New CLTs. In STOC, 2011.
  • [47] T. Westmann et al. The Implementation and Performance of Compressed Databases. SIGMOD Record, 29(3), 2000.
  • [48] S. Williams et al. Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms. In SC (Supercomputing Conf.), 2007.
  • [49] K. Wu et al. Optimizing Bitmap Indices With Efficient Compression. TODS, 31(1), 2006.
  • [50] L. Yu et al. Exploiting Matrix Dependency for Efficient Distributed Matrix Computation. In SIGMOD, 2015.
  • [51] M. Zaharia et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In NSDI, 2012.
  • [52] C. Zhang et al. Materialization Optimizations for Feature Selection Workloads. In SIGMOD, 2014.