Biblio
Filters: Keyword is data items
Counterintuitive Characteristics of Optimal Distributed LRU Caching Over Unreliable Channels. 2019. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications. 694–702.
Least-recently-used (LRU) caching and its variants have conventionally been used as a fundamental and critical method to ensure fast and efficient data access in computer and communication systems. Emerging data-intensive applications over unreliable channels, e.g., mobile edge computing and wireless content delivery networks, have imposed new challenges in optimizing LRU caching systems in environments prone to failures. Most existing studies focus on reliable channels, e.g., wired Web servers and data centers, and have already yielded good insights and successful algorithms for reducing cache miss ratios. Surprisingly, we show that these widely held insights do not necessarily hold true for unreliable channels. We consider a single-hop multi-cache distributed system with data items dispatched by random hashing. The objective is to achieve efficient cache organization and data placement: the former allocates the total memory space to each of the involved caches; the latter decides data routing strategies and data replication schemes. Analytically, we characterize the unreliable LRU caches by explicitly deriving their asymptotic miss probabilities. Based on these results, we optimize the system design. Remarkably, these results are sometimes counterintuitive and differ from the ones obtained for reliable caches. We discover an interesting phenomenon: asymmetric cache organization is optimal even for symmetric channels. Specifically, even when channel unreliability probabilities are equal, allocating the cache spaces unequally can achieve better performance. We also propose an explicit unequal allocation policy that outperforms equal allocation. In addition, we prove that splitting the total cache space into separate LRU caches can achieve a lower asymptotic miss probability than resource pooling, which organizes the total space in a single LRU cache. These results provide new and even counterintuitive insights that motivate novel designs for caching systems over unreliable channels, and they can potentially be exploited to further improve system performance in practice.
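To make the setting concrete, here is a minimal simulation sketch (Python) of the system the abstract describes: several LRU caches fed by hash-based dispatch over unreliable channels, comparing an equal and an unequal split of the same total cache space. It is not the paper's analytical derivation; the Zipf-like demand model, the loss probability, and all parameter values are illustrative assumptions.

import random
from collections import OrderedDict
from itertools import accumulate


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def request(self, item):
        """Return True on a hit; otherwise insert the item, evicting the LRU entry."""
        if item in self.store:
            self.store.move_to_end(item)
            return True
        self.store[item] = None
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used item
        return False


def miss_ratio(cache_sizes, loss_prob, n_items=1000, n_requests=100_000, alpha=0.8, seed=1):
    rng = random.Random(seed)
    caches = [LRUCache(c) for c in cache_sizes]
    # Random hashing: each data item is assigned to one cache once and for all.
    assignment = [rng.randrange(len(caches)) for _ in range(n_items)]
    # Zipf-like popularity: item i is requested with probability proportional to 1/(i+1)^alpha.
    cum_weights = list(accumulate(1.0 / (i + 1) ** alpha for i in range(n_items)))
    misses = 0
    for _ in range(n_requests):
        item = rng.choices(range(n_items), cum_weights=cum_weights)[0]
        if rng.random() < loss_prob:
            misses += 1  # channel failure: the cached copy is unreachable
            continue
        if not caches[assignment[item]].request(item):
            misses += 1  # ordinary cache miss
    return misses / n_requests


if __name__ == "__main__":
    total_space = 200
    # Toy comparison of equal vs. unequal allocation of the same total space
    # under symmetric channel unreliability (both channels fail with probability 0.2).
    print("equal split  :", miss_ratio([total_space // 2, total_space // 2], loss_prob=0.2))
    print("unequal split:", miss_ratio([140, 60], loss_prob=0.2))

The 140/60 split is an arbitrary example of an asymmetric allocation; the paper derives an explicit unequal allocation policy and asymptotic miss probabilities rather than relying on simulation.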
Edge Caching for Enriched Notifications Delivery in Big Active Data. 2018. 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). 696–705.
In this paper, we propose a set of caching strategies for big active data (BAD) systems. BAD is a data management paradigm that allows ingestion of massive amounts of data from heterogeneous sources, such as sensor data, social networks, and web and crowdsourced data, into a large data cluster consisting of many computing and storage nodes, and enables a very large number of end users to subscribe to those data items through declarative subscriptions. A set of distributed broker nodes connect these end users to the backend data cluster, manage their subscriptions, and deliver the subscription results to the end users. Unlike most traditional publish-subscribe systems, which match subscriptions against a single stream of publications to generate notifications, BAD can match subscriptions across multiple publications (by leveraging storage in the backend) and thus can enrich notifications with a rich set of diverse contents. As the matched results are delivered to the end users through the brokers, the broker nodes cache the results for a while so that subscribers can retrieve them with reduced latency. Interesting research questions arise in this context: which result objects to cache or drop when the cache becomes full (eviction-based caching), and how to admit objects with an explicit expiration time indicating how long they should reside in the cache (TTL-based caching). To this end, we propose a set of caching strategies for the brokers and show that the schemes achieve varying degrees of efficiency in terms of notification delivery in the BAD system. We evaluate our schemes via a prototype implementation and through detailed simulation studies.
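The two broker-side caching styles the abstract contrasts can be illustrated with a short Python sketch: an eviction-based cache that drops the least recently used result when full, and a TTL-based cache that admits each result with an explicit expiration time. The class and method names, the LRU eviction choice, and the lazy expiration on lookup are illustrative assumptions, not the paper's implementation or its specific strategies.

import time
from collections import OrderedDict


class EvictionCache:
    """Fixed-capacity cache: when full, drop the least recently used result."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def put(self, key, result):
        self.store[key] = result
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)
            return self.store[key]
        return None  # not cached: the subscriber must fetch from the backend


class TTLCache:
    """Cache in which every admitted result carries its own expiration time."""

    def __init__(self):
        self.store = {}

    def put(self, key, result, ttl_seconds):
        self.store[key] = (result, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        result, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]  # expired: lazily remove on access
            return None
        return result


if __name__ == "__main__":
    # Hypothetical usage at a broker: cache a matched result for 30 seconds.
    broker_cache = TTLCache()
    broker_cache.put("subscriber-42", {"matched": ["pub-7", "pub-9"]}, ttl_seconds=30)
    print(broker_cache.get("subscriber-42"))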