Executing cyclic scientific workflows in the cloud

Our paper “Executing cyclic scientific workflows in the cloud” has just been published in Springer’s Journal of Cloud Computing. In this paper, we present an algorithm and a software architecture for a cloud-based system that executes cyclic scientific workflows whose structure may change at run time.

Existing approaches either rely on workflow definitions based on directed acyclic graphs (DAGs) or require workarounds to implement cyclic structures. In contrast, our system supports cycles natively, avoids workarounds, and as such reduces the complexity of workflow modelling and maintenance. Our algorithm traverses workflow graphs and transforms them iteratively into linear sequences of executable actions. We call these sequences process chains. Our software architecture distributes the process chains to multiple compute nodes in the cloud and oversees their execution.
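To make the idea of process chains more tangible, here is a minimal sketch in Python. It is not the algorithm from the paper and not Steep's API; it only illustrates the general principle under simplifying assumptions. The names `Action`, `on_done`, `build_process_chains`, `run_workflow`, and `refine` are hypothetical, "execution" is simulated by marking an action's outputs as available, and a cycle is modelled by letting a finished action add follow-up actions to the workflow.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(eq=False)
class Action:
    name: str
    inputs: List[str]
    outputs: List[str]
    # Hypothetical hook: after the action has run, it may return new actions
    # that are added to the workflow. The sketch uses this to model a cycle.
    on_done: Optional[Callable[[], List["Action"]]] = None


def build_process_chains(pending: List[Action], available: Set[str]) -> List[List[Action]]:
    """Remove all currently executable actions from `pending` and group
    them into linear sequences (process chains)."""
    chains = []
    ready = [a for a in pending if all(i in available for i in a.inputs)]
    for start in ready:
        if start not in pending:
            continue  # already part of another chain
        pending.remove(start)
        chain = [start]
        produced = set(available) | set(start.outputs)
        while True:
            last = chain[-1]
            consumers = [a for a in pending if set(a.inputs) & set(last.outputs)]
            # Extend the chain only if there is exactly one consumer and
            # all of its inputs will exist once the chain has run so far.
            if len(consumers) != 1 or not set(consumers[0].inputs) <= produced:
                break
            nxt = consumers[0]
            pending.remove(nxt)
            chain.append(nxt)
            produced |= set(nxt.outputs)
        chains.append(chain)
    return chains


def run_workflow(actions: List[Action], initial_inputs: Set[str]) -> List[List[str]]:
    pending = list(actions)
    available = set(initial_inputs)
    executed = []
    while pending:
        chains = build_process_chains(pending, available)
        if not chains:
            raise RuntimeError("remaining actions can never become executable")
        # A real system would send these chains to compute nodes in parallel.
        for chain in chains:
            executed.append([a.name for a in chain])
            for action in chain:
                available.update(action.outputs)  # stand-in for real execution
                if action.on_done is not None:
                    pending.extend(action.on_done())  # workflow changes at run time
    return executed


def refine(iteration: int, max_iterations: int) -> Action:
    """A step that re-schedules itself until `max_iterations` is reached."""
    def on_done() -> List[Action]:
        if iteration + 1 < max_iterations:
            return [refine(iteration + 1, max_iterations)]
        return [Action("report", [f"mesh{iteration + 1}"], ["summary"])]
    return Action(f"refine{iteration}", [f"mesh{iteration}"],
                  [f"mesh{iteration + 1}"], on_done)


workflow = [
    Action("load", ["raw"], ["mesh_in"]),
    Action("prepare", ["mesh_in"], ["mesh0"]),
    refine(0, 3),
]
chains = run_workflow(workflow, {"raw"})
print(chains)
```

In this sketch, the first pass yields one chain containing `load`, `prepare`, and `refine0`, because each step has exactly one consumer. Every later refinement step only appears in the workflow after the previous one has finished, so it ends up in a chain of its own. The workflow graph is therefore never unrolled up front; it is converted piece by piece as results become available.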

We evaluate our approach by applying it to two practical use cases from the domains of astronomy and engineering. We also compare it with two existing workflow management systems. The evaluation demonstrates that our algorithm is able to execute dynamically changing workflows with cycles and that the design and maintenance of complex workflows are easier than with existing solutions. It also shows that our software architecture can run process chains on multiple compute nodes in parallel to significantly speed up workflow execution.

An implementation of our algorithm and the software architecture is available in the Steep Workflow Management System, which we released under an open-source license. The resources for the first practical use case are also available as open source so that the results can be reproduced.

Reference

Krämer, M., Würz, H. M., & Altenhofen, C. (2021). Executing cyclic scientific workflows in the cloud. Journal of Cloud Computing, 10(25), 1–26. https://doi.org/10.1186/s13677-021-00229-7

Download

The paper has been published under the CC-BY 4.0 license. You may download the final manuscript here.


Posted by Michel Krämer on 7 April 2021

