Uddin, M. Y. S., Venkatasubramanian, N..
2018.
Edge Caching for Enriched Notifications Delivery in Big Active Data. 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). :696–705.
In this paper, we propose a set of caching strategies for big active data (BAD) systems. BAD is a data management paradigm that allows ingestion of massive amount of data from heterogeneous sources, such as sensor data, social networks, web and crowdsourced data in a large data cluster consisting of many computing and storage nodes, and enables a very large number of end users to subscribe to those data items through declarative subscriptions. A set of distributed broker nodes connect these end users to the backend data cluster, manage their subscriptions and deliver the subscription results to the end users. Unlike the most traditional publish-subscribe systems that match subscriptions against a single stream of publications to generate notifications, BAD can match subscriptions across multiple publications (by leveraging storage in the backend) and thus can enrich notifications with a rich set of diverse contents. As the matched results are delivered to the end users through the brokers, the broker node caches the results for a while so that the subscribers can retrieve them with reduced latency. Interesting research questions arise in this context so as to determine which result objects to cache or drop when the cache becomes full (eviction-based caching) or to admit objects with an explicit expiration time indicating how much time they should reside in the cache (TTL based caching). To this end, we propose a set of caching strategies for the brokers and show that the schemes achieve varying degree of efficiency in terms of notification delivery in the BAD system. We evaluate our schemes via a prototype implementation and through detailed simulation studies.