A Novel Data Placement Strategy for Data-Sharing Scientific Workflows in Heterogeneous Edge-Cloud Computing Environments

Title: A Novel Data Placement Strategy for Data-Sharing Scientific Workflows in Heterogeneous Edge-Cloud Computing Environments
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Du, Xin; Tang, Songtao; Lu, Zhihui; Wu, Jie; Gai, Keke; Hung, Patrick C.K.
Conference Name: 2020 IEEE International Conference on Web Services (ICWS)
Date Published: Oct. 2020
Publisher: IEEE
ISBN Number: 978-1-7281-8787-7
Keywords: cloud computing, compositionality, Computational modeling, Data models, data placement, data-sharing, delays, Distributed databases, Heterogeneous edge-cloud computing environments, Metrics, particle swarm optimization, pubcrawl, resilience, Resiliency, Scientific Computing Security, scientific workflows, web services
Abstract: The deployment of datasets in the heterogeneous edge-cloud computing paradigm has received increasing attention in state-of-the-art research. However, due to the large sizes of the datasets and the existence of private scientific datasets, finding an optimal data placement strategy that minimizes data transmission while improving performance remains a persistent problem. In this study, the advantages of both edge and cloud computing are combined to construct a data placement model that works for multiple scientific workflows. The most difficult research challenge is to provide a data placement strategy that considers shared datasets, both within individual workflows and among multiple workflows, across various geographically distributed environments. According to the constructed model, not only the storage capacity of edge micro-datacenters but also the data transfer between multiple clouds across regions must be considered. To address this issue, the authors consider the characteristics of this model and identify the factors that cause the transmission delay. They propose a discrete particle swarm optimization algorithm with differential evolution (DE-DPSO) to distribute datasets during workflow execution. Based on this, a new data placement strategy named DE-DPSO-DPS is proposed. DE-DPSO-DPS is evaluated using several experiments designed in simulated heterogeneous edge-cloud computing environments. The results demonstrate that the strategy can effectively reduce data transmission time and achieve superior performance compared to traditional strategies for data-sharing scientific workflows.
URL: https://ieeexplore.ieee.org/document/9284088
DOI: 10.1109/ICWS49710.2020.00073
Citation Key: du_novel_2020