Biblio

Filters: Keyword is randomized algorithms
2018-08-23
Zhang, Kai, Liu, Chuanren, Zhang, Jie, Xiong, Hui, Xing, Eric, Ye, Jieping.  2017.  Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. :615–623.

Matrix sketching aims to find a compact representation of a matrix while preserving most of its properties, and is a fundamental building block in modern scientific computing. Randomized algorithms represent the state of the art and have attracted great interest from machine learning, data mining, and theoretical computer science. However, they still require access to the entire input matrix to produce the desired factorizations, which can be a major computational and memory bottleneck in truly large problems. In this paper, we uncover an interesting theoretical connection between matrix low-rank decomposition and lossy signal compression, based on which a cascaded compression sampling framework is devised to approximate an m-by-n matrix in only O(m+n) time and space. The proposed method accesses only a small number of matrix rows and columns, which significantly improves the memory footprint. Meanwhile, by sequentially teaming two rounds of approximation procedures and upgrading the sampling strategy from uniform probabilities to more sophisticated, encoding-oriented sampling, a significant algorithmic boost is achieved, uncovering more granular structures in the data. Empirical results on a wide spectrum of real-world, large-scale matrices show that, while taking only linear time and space, the accuracy of our method rivals that of state-of-the-art randomized algorithms consuming a quadratic, O(mn), amount of resources.
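
As a rough illustration of the two-round idea in the abstract (a cheap uniform sample followed by a refined, data-driven sample), here is a minimal NumPy sketch. The function name, parameters, and the projection-based second-round probabilities are assumptions for illustration, not the authors' algorithm; in particular, this toy version takes a full pass over A to compute the refined probabilities, which the paper's linear-cost method avoids.

```python
import numpy as np

def cascaded_sketch(A, c1=50, c2=25, seed=None):
    """Toy two-round column sampling for a low-rank sketch of A (m x n).

    Round 1 draws a cheap uniform sample; round 2 re-samples with
    probabilities derived from the round-1 sketch, a stand-in for the
    paper's encoding-oriented sampling.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Round 1: uniform, data-oblivious column sampling.
    k1 = min(c1, n)
    idx1 = rng.choice(n, size=k1, replace=False)
    C1 = A[:, idx1]                              # m x k1 coarse sketch

    # Refined probabilities: squared projection norms onto the coarse
    # sketch's range.  NOTE: this toy version reads all of A here; the
    # paper's linear-cost method avoids exactly this full pass.
    Q, _ = np.linalg.qr(C1)
    scores = np.sum((Q.T @ A) ** 2, axis=0) + 1e-12
    p = scores / scores.sum()

    # Round 2: importance sampling with the refined probabilities,
    # rescaled (with-replacement convention) for near-unbiasedness.
    k2 = min(c2, n)
    idx2 = rng.choice(n, size=k2, replace=False, p=p)
    return A[:, idx2] / np.sqrt(k2 * p[idx2])
```

Projecting A onto the span of the returned columns gives the low-rank approximation; judging by the abstract, the paper's contribution is obtaining such a refined second-round sample while touching only O(m+n) entries, which this toy version does not achieve.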
2017-10-13
Denysyuk, Oksana, Woelfel, Philipp.  2016.  Are Shared Objects Composable Under an Oblivious Adversary? Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing. :335–344.

Linearizability [5] of a concurrent object ensures that operations on that object appear to execute atomically. It is well known that linearizable implementations are composable: in an algorithm designed to work with atomic objects, replacing any atomic object with a linearizable implementation preserves the correctness of the original algorithm. However, replacing atomic objects with linearizable ones in a randomized algorithm can break the original probabilistic guarantees [3]. With an adaptive adversary, this problem is solved by using strongly linearizable [3] objects in the composition. But what about an oblivious adversary? In this paper, we ask the fundamental question of what property makes implementations composable under an oblivious adversary. It turns out that the property depends on the entire collection of objects used in the algorithm. We show that the composition of every randomized algorithm with a collection of linearizable objects OL is sound if and only if OL satisfies a property called library homogeneity. Roughly, this property says that, for each process, every operation on OL has the same length and linearization point. This result has several important implications. First, for an oblivious adversary, there is nothing analogous to linearizability that ensures the atomic objects of an algorithm can be replaced with their implementations. Second, in general, algorithms cannot use implemented objects alongside atomic objects provided by the system, such as registers. These results show that, with an oblivious adversary, it is much harder to implement reusable object types than previously believed.
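
The paper's formal model is not in the abstract, but the library-homogeneity condition can be made concrete with a toy step-counting simulation: an oblivious adversary fixes the whole schedule up front, and because every operation takes the same number of steps and linearizes at the same step, the schedule alone determines the order of linearization points. Everything below (the step model and the names SCHEDULE, STEPS, LIN_POINT, run) is an illustrative assumption, not the authors' definition.

```python
import itertools

# Toy model: an oblivious adversary fixes the whole schedule (which
# process takes the next step) before any random choices are made.
SCHEDULE = list(itertools.islice(itertools.cycle([0, 1]), 24))

STEPS = 3       # homogeneity: every operation takes exactly 3 steps...
LIN_POINT = 2   # ...and takes effect at its 2nd step, for every process

def run(ops_per_process):
    """ops_per_process[p]: list of functions mutating the shared state."""
    shared = {"count": 0}
    pending = [list(ops) for ops in ops_per_process]
    progress = [0] * len(ops_per_process)     # steps into the current op
    lin_order = []
    for p in SCHEDULE:
        if not pending[p]:
            continue
        progress[p] += 1
        if progress[p] == LIN_POINT:          # the fixed linearization point
            pending[p][0](shared)
            lin_order.append(p)
        if progress[p] == STEPS:              # operation completes
            pending[p].pop(0)
            progress[p] = 0
    return shared, lin_order

inc = lambda s: s.__setitem__("count", s["count"] + 1)
state, order = run([[inc, inc], [inc]])
# Because STEPS and LIN_POINT are uniform, `order` is determined by
# SCHEDULE alone, regardless of which operations the algorithm chose;
# that independence is what makes composition sound in this toy model.
print(state, order)
```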

2017-05-16
Shrivastava, Anshumali, Konig, Arnd Christian, Bilenko, Mikhail.  2016.  Time Adaptive Sketches (Ada-Sketches) for Summarizing Data Streams. Proceedings of the 2016 International Conference on Management of Data. :1417–1432.

Obtaining frequency information from data streams, in limited space, is a well-recognized problem in the literature. A number of recent practical applications (such as those in computational advertising) require temporally-aware solutions: obtaining historical count statistics for both time-points and time-ranges. In these scenarios, accuracy of estimates is typically more important for recent instances than for older ones; we call this desirable property Time Adaptiveness. With this observation, [20] introduced the Hokusai technique, based on count-min sketches, for estimating the frequency of any given item at any given time. The proposed approach is problematic in practice, as its memory requirements grow linearly with time and it produces discontinuities in the estimation accuracy. In this work, we describe a new method, Time-adaptive Sketches (Ada-sketches), that overcomes these limitations while extending and providing a strict generalization of several popular sketching algorithms. The core idea of our method is inspired by the well-known digital Dolby noise reduction procedure, which dates back to the 1960s. The theoretical analysis presented may be of independent interest, as it provides clear results for the time-adaptive nature of the errors. An experimental evaluation on real streaming datasets demonstrates the superiority of the described method over Hokusai in estimating point and range queries over time. The method is simple to implement and offers a variety of design choices for future extensions. The simplicity of the procedure and the method's generalization of classic sketching techniques give hope for wide applicability of Ada-sketches in practice.
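
The abstract does not spell out the construction, but the Dolby analogy suggests a pre-emphasis/de-emphasis scheme layered on a count-min sketch: amplify updates by a weight that grows with time, then divide it back out at query time, so collision noise is relatively smaller for recent items. Below is a minimal sketch of that idea, assuming an exponential weight f(t) = alpha**t; the class and parameter names are hypothetical, not the paper's exact construction.

```python
import numpy as np

class AdaCountMin:
    """Count-min sketch with time-adaptive pre/de-emphasis.

    Hypothetical parameters: `alpha` sets the weight f(t) = alpha**t;
    `width` and `depth` are the usual count-min dimensions.
    """

    def __init__(self, width=1024, depth=4, alpha=1.001, seed=0):
        self.w, self.alpha = width, alpha
        self.table = np.zeros((depth, width))
        rng = np.random.default_rng(seed)
        self.salts = [int(s) for s in rng.integers(1, 2**31, size=depth)]

    def _cells(self, item, t):
        # One cell per row for the (item, time-point) pair.
        return [hash((s, item, t)) % self.w for s in self.salts]

    def update(self, item, t, count=1):
        # Pre-emphasis: later updates enter with a larger weight, so
        # they stand out against the collision noise shared by a cell.
        boost = self.alpha ** t
        for row, col in enumerate(self._cells(item, t)):
            self.table[row, col] += count * boost

    def estimate(self, item, t):
        # De-emphasis: divide the weight back out.  Older time-points
        # divide by a smaller f(t), so their collision noise is
        # inflated; the error grows gracefully with age.
        raw = min(self.table[row, col]
                  for row, col in enumerate(self._cells(item, t)))
        return raw / (self.alpha ** t)

cm = AdaCountMin()
cm.update("ad42", t=100)
cm.update("ad42", t=100)
print(cm.estimate("ad42", t=100))   # ~2.0 when no collisions occur
```

Any monotone weight function would fit this scheme; the exponential used here is simply the easiest choice for making relative error decay smoothly with recency.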