Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), 2023

Abstract
We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's generator over larger alphabets. Our generator allows a trade-off between seed length and the time needed to compute a given block of the generator's output. HashPRG can be used to obtain derandomizations with much better update time and without sacrificing space for a large number of data stream algorithms, for example:

Andoni's $F_p$ estimation algorithm for constant $p > 2$ (ICASSP, 2017) assumes a random oracle, but achieves optimal space and constant update time. Using HashPRG's time-space trade-off we eliminate the random oracle assumption while preserving the other properties. Previously no time-optimal derandomization was known. Using similar techniques, we give an algorithm for a relaxed version of $\ell_p$ sampling in a turnstile stream. Both of our algorithms use $\tilde{O}(d^{1-2/p})$ bits of space and have $O(1)$ update time.

For $0 < p < 2$, the $1 \pm \varepsilon$ approximate $F_p$ estimation algorithm of Kane et al. (STOC, 2011) uses an optimal $O(\varepsilon^{-2} \log d)$ bits of space but has an update time of $O(\log^2(1/\varepsilon) \log\log(1/\varepsilon))$. Using HashPRG, we show that if $1/\sqrt{d} \le \varepsilon \le 1/d^{c}$ for an arbitrarily small constant $c > 0$, then we can obtain a $1 \pm \varepsilon$ approximate $F_p$ estimation algorithm that uses the optimal $O(\varepsilon^{-2} \log d)$ bits of space and has an update time of $O(\log d)$ in the Word RAM model, which is more than a quadratic improvement in the update time. We obtain similar improvements for entropy estimation.

CountSketch, with the fine-grained error analysis of Minton and Price (SODA, 2014): for derandomization, they suggested a direct application of Nisan's generator, yielding a logarithmic multiplicative space overhead. With HashPRG we obtain an efficient derandomization yielding the same asymptotic space as when assuming a random oracle. Our ability to obtain a time-efficient derandomization makes crucial use of HashPRG's symmetry. We also give the first derandomization of a recent private version of CountSketch.

For a $d$-dimensional vector $x$ being updated in a turnstile stream, we show that $\|x\|_\infty$ can be estimated up to an additive error of $\varepsilon \|x\|_2$ using $O(\varepsilon^{-2} \log(1/\varepsilon) \log d)$ bits of space. Additionally, the update time of this algorithm is $O(\log(1/\varepsilon))$ in the Word RAM model. We show that the space complexity of this algorithm is optimal up to constant factors. However, for vectors $x$ with $\|x\|_\infty = \Theta(\|x\|_2)$, we show that the lower bound can be broken by giving an algorithm that uses $O(\varepsilon^{-2} \log d)$ bits of space and approximates $\|x\|_\infty$ up to an additive error of $\varepsilon \|x\|_2$. We use our aforementioned derandomization of the CountSketch data structure to obtain this algorithm, and using the time-space trade-off of HashPRG, we show that the update time of this algorithm is also $O(\log(1/\varepsilon))$ in the Word RAM model.
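
To make the seed-length versus block-time trade-off concrete, here is a minimal sketch of Nisan's classical recursive generator, not of HashPRG itself, whose construction the abstract does not spell out. It assumes blocks are elements of $\mathbb{Z}_P$ for the Mersenne prime $P = 2^{61} - 1$ and uses the pairwise-independent family $h(x) = ax + b \bmod P$. The names `make_hash`, `apply_hash`, `nisan_full`, and `nisan_block` are illustrative and do not come from the paper.

```python
import random

# Assumption for this sketch: each output block is an element of Z_P,
# with P = 2^61 - 1 (a Mersenne prime).
P = (1 << 61) - 1


def make_hash(rng):
    """Sample h(x) = a*x + b mod P from a pairwise-independent family."""
    return (rng.randrange(P), rng.randrange(P))


def apply_hash(h, x):
    a, b = h
    return (a * x + b) % P


def nisan_full(x, hashes):
    """Nisan's recursion: G_0(x) = x and
    G_k(x) = G_{k-1}(x) || G_{k-1}(h_k(x)).
    Returns all 2^k blocks, where k = len(hashes)."""
    if not hashes:
        return [x]
    *rest, last = hashes
    return nisan_full(x, rest) + nisan_full(apply_hash(last, x), rest)


def nisan_block(x, hashes, j):
    """Compute only block j of the output with at most k hash evaluations.
    Bit i-1 of j decides whether h_i is applied, from h_k down to h_1."""
    for i in range(len(hashes), 0, -1):
        if (j >> (i - 1)) & 1:
            x = apply_hash(hashes[i - 1], x)
    return x
```

In this sketch the seed consists of one block plus $k$ hash descriptions and stretches to $2^k$ blocks, while a single block costs up to $k$ hash evaluations. That per-block cost is the quantity a streaming algorithm pays on each update, which is why the abstract emphasizes trading seed length against block computation time.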