Visible to the public Biblio

Filters: Keyword is Data-Intensive Applications  [Clear All Filters]
2021-02-15
Hu, X., Deng, C., Yuan, B..  2020.  Reduced-Complexity Singular Value Decomposition For Tucker Decomposition: Algorithm And Hardware. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1793–1797.
Tensors, as the multidimensional generalization of matrices, are naturally suited for representing and processing high-dimensional data. To date, tensors have been widely adopted in various data-intensive applications, such as machine learning and big data analysis. However, due to the inherent large-size characteristics of tensors, tensor algorithms, as the approaches that synthesize, transform or decompose tensors, are very computation and storage expensive, thereby hindering the potential further adoptions of tensors in many application scenarios, especially on the resource-constrained hardware platforms. In this paper, we propose a reduced-complexity SVD (Singular Vector Decomposition) scheme, which serves as the key operation in Tucker decomposition. By using iterative self-multiplication, the proposed scheme can significantly reduce the storage and computational costs of SVD, thereby reducing the complexity of the overall process. Then, corresponding hardware architecture is developed with 28nm CMOS technology. Our synthesized design can achieve 102GOPS with 1.09 mm2 area and 37.6 mW power consumption, and thereby providing a promising solution for accelerating Tucker decomposition.
2020-12-11
Sabek, I., Chandramouli, B., Minhas, U. F..  2019.  CRA: Enabling Data-Intensive Applications in Containerized Environments. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :1762—1765.
Today, a modern data center hosts a wide variety of applications comprising batch, interactive, machine learning, and streaming applications. In this paper, we factor out the commonalities in a large majority of these applications, into a generic dataflow layer called Common Runtime for Applications (CRA). In parallel, another trend, with containerization technologies (e.g., Docker), has taken a serious hold on cloud-scale data centers, with direct implications on building next generation of data center applications. Container orchestrators (e.g., Kubernetes) have made deployment a lot easy, and they solve many infrastructure level problems, e.g., service discovery, auto-restart, and replication. For best in class performance, there is a need to marry the next generation applications with containerization technologies. To that end, CRA leverages and builds upon the containerization and resource orchestration capabilities of Kubernetes/Docker, and makes it easy to build a wide range of cloud-edge applications on top. To the best of our knowledge, we are the first to present a cloud native runtime for building data center applications. We show the efficiency of CRA through various micro-benchmarking experiments.
2020-03-30
Tabassum, Anika, Nady, Anannya Islam, Rezwanul Huq, Mohammad.  2019.  Mathematical Formulation and Implementation of Query Inversion Techniques in RDBMS for Tracking Data Provenance. 2019 7th International Conference on Information and Communication Technology (ICoICT). :1–6.
Nowadays the massive amount of data is produced from different sources and lots of applications are processing these data to discover insights. Sometimes we may get unexpected results from these applications and it is not feasible to trace back to the data origin manually to find the source of errors. To avoid this problem, data must be accompanied by the context of how they are processed and analyzed. Especially, data-intensive applications like e-Science always require transparency and therefore, we need to understand how data has been processed and transformed. In this paper, we propose mathematical formulation and implementation of query inversion techniques to trace the provenance of data in a relational database management system (RDBMS). We build mathematical formulations of inverse queries for most of the relational algebra operations and show the formula for join operations in this paper. We, then, implement these formulas of inversion techniques and the experiment shows that our proposed inverse queries can successfully trace back to original data i.e. finding data provenance.
2020-02-18
Quan, Guocong, Tan, Jian, Eryilmaz, Atilla.  2019.  Counterintuitive Characteristics of Optimal Distributed LRU Caching Over Unreliable Channels. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications. :694–702.
Least-recently-used (LRU) caching and its variants have conventionally been used as a fundamental and critical method to ensure fast and efficient data access in computer and communication systems. Emerging data-intensive applications over unreliable channels, e.g., mobile edge computing and wireless content delivery networks, have imposed new challenges in optimizing LRU caching systems in environments prone to failures. Most existing studies focus on reliable channels, e.g., on wired Web servers and within data centers, which have already yielded good insights with successful algorithms on how to reduce cache miss ratios. Surprisingly, we show that these widely held insights do not necessarily hold true for unreliable channels. We consider a single-hop multi-cache distributed system with data items being dispatched by random hashing. The objective is to achieve efficient cache organization and data placement. The former allocates the total memory space to each of the involved caches. The latter decides data routing strategies and data replication schemes. Analytically we characterize the unreliable LRU caches by explicitly deriving their asymptotic miss probabilities. Based on these results, we optimize the system design. Remarkably, these results sometimes are counterintuitive, differing from the ones obtained for reliable caches. We discover an interesting phenomenon: asymmetric cache organization is optimal even for symmetric channels. Specifically, even when channel unreliability probabilities are equal, allocating the cache spaces unequally can achieve a better performance. We also propose an explicit unequal allocation policy that outperforms the equal allocation. In addition, we prove that splitting the total cache space into separate LRU caches can achieve a lower asymptotic miss probability than resource pooling that organizes the total space in a single LRU cache. These results provide new and even counterintuitive insights that motivate novel designs for caching systems over unreliable channels. They can potentially be exploited to further improve the system performance in real practice.
2018-12-03
Liu, Yin, Song, Zheng, Tilevich, Eli.  2017.  Querying Invisible Objects: Supporting Data-Driven, Privacy-Preserving Distributed Applications. Proceedings of the 14th International Conference on Managed Languages and Runtimes. :60–72.

When transferring sensitive data to a non-trusted party, end-users require that the data be kept private. Mobile and IoT application developers want to leverage the sensitive data to provide better user experience and intelligent services. Unfortunately, existing programming abstractions make it impossible to reconcile these two seemingly conflicting objectives. In this paper, we present a novel programming mechanism for distributed managed execution environments that hides sensitive user data, while enabling developers to build powerful and intelligent applications, driven by the properties of the sensitive data. Specifically, the sensitive data is never revealed to clients, being protected by the runtime system. Our abstractions provide declarative and configurable data query interfaces, enforced by a lightweight distributed runtime system. Developers define when and how clients can query the sensitive data's properties (i.e., how long the data remains accessible, how many times its properties can be queried, which data query methods apply, etc.). Based on our evaluation, we argue that integrating our novel mechanism with the Java Virtual Machine (JVM) can address some of the most pertinent privacy problems of IoT and mobile applications.