Evaluating Scientific Workflow Engines for Data and Compute Intensive Discoveries

Title: Evaluating Scientific Workflow Engines for Data and Compute Intensive Discoveries
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Singh, Rina; Graves, Jeffrey A.; Anantharaj, Valentine; Sukumar, Sreenivas R.
Conference Name: 2019 IEEE International Conference on Big Data (Big Data)
Keywords: Analytical models, compositionality, Computational modeling, Converged Workloads, Data Intensive Discoveries, Data models, End-to-End Workflows, Engines, Predictive Metrics, pubcrawl, resilience, Scientific Computing Security, Scientific Experiments, scientific workflows, Software, Task Analysis, Tools, Workflow Engines
Abstract: Workflow engines used to script scientific experiments involving numerical simulation, data analysis, instruments, edge sensors, and artificial intelligence have to deal with the complexities of hardware, software, resource availability, and the collaborative nature of science. In this paper, we survey workflow engines used in data-intensive and compute-intensive discovery pipelines from scientific disciplines such as astronomy, high energy physics, earth system science, bio-medicine, and material science and present a qualitative analysis of their respective capabilities. We compare 5 popular workflow engines and their differentiated approach to job orchestration, job launching, data management and provenance, security authentication, ease-of-use, workflow description, and scripting semantics. The comparisons presented in this paper allow practitioners to choose the appropriate engine for their scientific experiment and lead to recommendations for future work.
DOI: 10.1109/BigData47090.2019.9006223
Citation Key: singh_evaluating_2019