Biblio

Filters: Keyword is data placement
2021-09-16
Du, Xin, Tang, Songtao, Lu, Zhihui, Wu, Jie, Gai, Keke, Hung, Patrick C.K..  2020.  A Novel Data Placement Strategy for Data-Sharing Scientific Workflows in Heterogeneous Edge-Cloud Computing Environments. 2020 IEEE International Conference on Web Services (ICWS). :498–507.
The deployment of datasets in the heterogeneous edge-cloud computing paradigm has received increasing attention in state-of-the-art research. However, because of their large sizes and the existence of private scientific datasets, finding an optimal data placement strategy that can minimize data transmission as well as improve performance remains a persistent problem. In this study, the advantages of both edge and cloud computing are combined to construct a data placement model that works for multiple scientific workflows. The most difficult research challenge is to provide a data placement strategy that accounts for shared datasets, both within individual workflows and among multiple workflows, across various geographically distributed environments. According to the constructed model, not only the storage capacity of edge micro-datacenters but also the data transfer between multiple clouds across regions must be considered. To address this issue, we considered the characteristics of this model and identified the factors that cause transmission delay. We propose a discrete particle swarm optimization algorithm with differential evolution (DE-DPSO) to distribute datasets during workflow execution, and on this basis we propose a new data placement strategy named DE-DPSO-DPS. DE-DPSO-DPS is evaluated in several experiments designed in simulated heterogeneous edge-cloud computing environments. The results demonstrate that our data placement strategy can effectively reduce the data transmission time and achieve superior performance compared to traditional strategies for data-sharing scientific workflows.
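As a rough illustration of the kind of search such a strategy performs, the sketch below applies a discrete particle swarm with a differential-evolution-style recombination to assign datasets to datacenters under capacity limits. It is not the paper's DE-DPSO-DPS algorithm; the dataset sizes, capacities, cost matrix, task list, and parameter values are invented placeholders, and a simple per-GB transfer cost stands in for the data-sharing and bandwidth model described above.

```python
# Illustrative sketch (not the authors' code): a discrete particle swarm with a
# differential-evolution-style recombination that assigns datasets to datacenters
# so that an estimated transfer cost is minimized. All numbers are placeholders.
import random

N_DATASETS, N_DCS = 8, 3                                      # hypothetical problem size
SIZES = [random.uniform(1, 10) for _ in range(N_DATASETS)]    # dataset sizes (GB)
CAPACITY = [30.0, 25.0, 20.0]                                 # per-datacenter storage limits (GB)
COST = [[0, 2, 5], [2, 0, 3], [5, 3, 0]]                      # per-GB transfer cost between datacenters
TASKS = [(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (2, 6)]      # each task reads a pair of (possibly shared) datasets

def fitness(placement):
    """Estimated transfer cost; capacity violations are heavily penalized."""
    used = [0.0] * N_DCS
    for d, dc in enumerate(placement):
        used[dc] += SIZES[d]
    penalty = sum(max(0.0, used[dc] - CAPACITY[dc]) for dc in range(N_DCS)) * 1e3
    transfer = sum(COST[placement[a]][placement[b]] * min(SIZES[a], SIZES[b])
                   for a, b in TASKS)
    return transfer + penalty

def de_recombine(best, p1, p2, cr=0.5):
    """DE-style recombination adapted to discrete dataset-to-datacenter assignments."""
    return [best[i] if random.random() < cr else random.choice((p1[i], p2[i]))
            for i in range(N_DATASETS)]

swarm = [[random.randrange(N_DCS) for _ in range(N_DATASETS)] for _ in range(20)]
best = min(swarm, key=fitness)
for _ in range(200):
    for i, particle in enumerate(swarm):
        p1, p2 = random.sample(swarm, 2)
        candidate = de_recombine(best, p1, p2)
        if fitness(candidate) < fitness(particle):   # greedy replacement, as in DE selection
            swarm[i] = candidate
    best = min(swarm + [best], key=fitness)

print("best placement:", best, "estimated cost:", round(fitness(best), 2))
```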
2020-02-18
Quan, Guocong, Tan, Jian, Eryilmaz, Atilla.  2019.  Counterintuitive Characteristics of Optimal Distributed LRU Caching Over Unreliable Channels. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications. :694–702.
Least-recently-used (LRU) caching and its variants have conventionally been used as a fundamental and critical method to ensure fast and efficient data access in computer and communication systems. Emerging data-intensive applications over unreliable channels, e.g., mobile edge computing and wireless content delivery networks, have imposed new challenges in optimizing LRU caching systems in environments prone to failures. Most existing studies focus on reliable channels, e.g., on wired Web servers and within data centers, which have already yielded good insights with successful algorithms on how to reduce cache miss ratios. Surprisingly, we show that these widely held insights do not necessarily hold true for unreliable channels. We consider a single-hop multi-cache distributed system with data items being dispatched by random hashing. The objective is to achieve efficient cache organization and data placement. The former allocates the total memory space to each of the involved caches. The latter decides data routing strategies and data replication schemes. Analytically, we characterize the unreliable LRU caches by explicitly deriving their asymptotic miss probabilities. Based on these results, we optimize the system design. Remarkably, these results are sometimes counterintuitive, differing from the ones obtained for reliable caches. We discover an interesting phenomenon: asymmetric cache organization is optimal even for symmetric channels. Specifically, even when channel unreliability probabilities are equal, allocating the cache spaces unequally can achieve better performance. We also propose an explicit unequal allocation policy that outperforms the equal allocation. In addition, we prove that splitting the total cache space into separate LRU caches can achieve a lower asymptotic miss probability than resource pooling that organizes the total space in a single LRU cache. These results provide new and even counterintuitive insights that motivate novel designs for caching systems over unreliable channels. They can potentially be exploited to further improve system performance in practice.
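A minimal simulation, under assumed parameters, can make the setting concrete: requests drawn from a Zipf-like popularity distribution are dispatched to two LRU caches by hashing, and a cache hit is lost with a fixed channel-failure probability. The capacities, failure probability, and workload below are arbitrary; the toy model only shows how an equal and an unequal split of the same total cache budget can be compared, and it does not reproduce the paper's asymptotic analysis or its allocation policy.

```python
# Toy comparison (not from the paper) of equal vs. unequal cache allocation for
# LRU caches behind unreliable channels. All parameter values are assumptions.
import random
from collections import OrderedDict

def lru_request(cache, capacity, key):
    """Return True on a hit; update LRU order and evict the oldest entry if full."""
    if key in cache:
        cache.move_to_end(key)
        return True
    cache[key] = None
    if len(cache) > capacity:
        cache.popitem(last=False)
    return False

def miss_ratio(capacities, p_fail=0.2, n_items=2000, n_reqs=100_000, alpha=1.0):
    caches = [OrderedDict() for _ in capacities]
    weights = [1.0 / (i + 1) ** alpha for i in range(n_items)]     # Zipf-like popularity
    reqs = random.choices(range(n_items), weights, k=n_reqs)
    misses = 0
    for key in reqs:
        c = hash(key) % len(caches)                                # hashing dispatch to a cache
        hit = lru_request(caches[c], capacities[c], key)
        if not hit or random.random() < p_fail:                    # an unreliable channel loses the hit
            misses += 1
    return misses / n_reqs

# Same total budget (200 slots), equal channel-failure probability on both caches.
print("equal split   :", miss_ratio([100, 100]))
print("unequal split :", miss_ratio([150, 50]))
```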
2015-05-05
Koyanagi, T., Shinjo, Y..  2014.  A fast and compact hybrid memory resident datastore for text analytics with autonomic memory allocation. 2014 5th International Conference on Information and Communication Systems (ICICS). :1-7.
This paper describes a high-performance and space-efficient memory-resident datastore for text analytics systems, based on a hash table for fast access, a dynamic trie for staging, and a list of Level-Order Unary Degree Sequence (LOUDS) tries for compactness. We achieve efficient memory allocation and data placement by placing frequently accessed keys in the hash table and infrequently accessed keys in the LOUDS tries, without using conventional cache algorithms. Our algorithm also dynamically changes the memory allocation sizes for these data structures according to the remaining available memory size. This technique yields 38.6% to 52.9% better throughput than a double-array trie, a conventional fast and compact datastore.
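A toy sketch of the hybrid layout follows, assuming a plain dictionary for the hot tier and a sorted array standing in for the LOUDS tries; the promotion threshold and hot-tier capacity are invented knobs, not the paper's autonomic memory allocation.

```python
# Hypothetical two-tier datastore: frequently accessed keys are promoted into a
# hash table, while the remaining keys stay in a compact sorted array (a stand-in
# for the paper's LOUDS tries). Thresholds and capacities are invented.
import bisect

class HybridStore:
    def __init__(self, hot_capacity=1000, promote_after=3):
        self.hot = {}                       # fast hash-table tier for hot keys
        self.cold_keys = []                 # sorted keys: compact, slower tier
        self.cold_vals = []
        self.counts = {}                    # access counts that drive promotion
        self.hot_capacity = hot_capacity
        self.promote_after = promote_after

    def put(self, key, value):
        if key in self.hot:
            self.hot[key] = value
            return
        i = bisect.bisect_left(self.cold_keys, key)
        if i < len(self.cold_keys) and self.cold_keys[i] == key:
            self.cold_vals[i] = value
        else:                               # new keys start in the compact tier
            self.cold_keys.insert(i, key)
            self.cold_vals.insert(i, value)

    def get(self, key):
        if key in self.hot:
            return self.hot[key]
        i = bisect.bisect_left(self.cold_keys, key)
        if i == len(self.cold_keys) or self.cold_keys[i] != key:
            return None
        value = self.cold_vals[i]
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] >= self.promote_after and len(self.hot) < self.hot_capacity:
            # promote a frequently accessed key into the hash-table tier
            self.hot[key] = value
            del self.cold_keys[i], self.cold_vals[i]
            del self.counts[key]
        return value

store = HybridStore(hot_capacity=4, promote_after=2)
for word in ["alpha", "beta", "gamma"]:
    store.put(word, len(word))
print(store.get("alpha"), store.get("alpha"))   # second lookup promotes "alpha"
print("hot tier:", store.hot)                   # {'alpha': 5}
```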