Steep 5.8.0

I’ve just released the new version 5.8.0 of my scientific workflow management system Steep. One of the highlights of this version is increased fault tolerance: Steep schedulers are now able to resume the work of other crashed instances when they detect that a process chain is running but not monitored anymore. Steep 5.8.0 also includes many other new features (see details below). The version has been thoroughly tested in practice over the last couple of months.

Steep is a scientific workflow management system that can execute data-driven workflows in the Cloud. It is very well suited to harness the possibilities of distributed computing in order to parallelise work and to speed up your data processing workflows, no matter how complex they are and regardless of how much data you need to process. Steep is open-source software developed at Fraunhofer IGD. You can download the binaries and the source code of Steep from its GitHub repository.

Resume monitoring of running process chains

Steep’s scheduler assigns process chains to agents for execution. It then monitors the execution and finally writes the process chain results to the database.

In the past, if a scheduler instance crashed, the results of running process chains were not collected. Instead, another Steep instance had to resume the workflow and re-execute, from the beginning, every process chain that was still running when the old scheduler crashed.

In version 5.8.0, a new scheduler instance can detect orphaned process chains and resume monitoring. Orphaned process chains are those that are currently being executed by an agent but not being monitored by any scheduler instance in the cluster.

This feature increases fault tolerance: workflows continue without interruption, and process chains no longer have to be repeated unnecessarily.

The scheduler looks for orphaned process chains at startup and then at a set interval. Please have a look at Steep’s documentation for information on how to configure this new feature.

New features for plugins

The interface for progress estimator plugins now supports multiple service IDs. This means you can use the same plugin to determine the progress of more than one service:

- name: myGenericProgressEstimator
  type: progressEstimator
  scriptFile: conf/plugins/myGenericProgressEstimator.kt
  supportedServiceIds:
    - myService
    - anotherService

In addition, it is now possible to declare dependencies between plugins, for example, if a plugin should be executed after another one. This applies to process chain adapters and initializers:

- name: myProcessChainAdapter
  type: processChainAdapter
  scriptFile: conf/plugins/myProcessChainAdapter.kt
  dependsOn:
    - anotherProcessChainAdapter

Finally, plugin script files can now be pre-compiled to speed up Steep’s startup time. Please read the documentation for more information.

Improved configuration

Configuration files for setups (setups.yaml) and service metadata (services/services.yaml) often contain repeated information (e.g. two setups might offer the same capabilities). With the new version, you can now use YAML anchors to simplify your configuration files.
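As a minimal sketch of how this could look, assume two setups in setups.yaml that differ only in their flavor but offer the same capabilities. The setup IDs, flavor names, and capability names below are made up for illustration, and the setups are reduced to a few fields. The anchor (`&commonCapabilities`) names the list where it is first defined, and the alias (`*commonCapabilities`) reuses it:

```yaml
# setups.yaml (illustrative only; IDs, flavors, and capabilities are hypothetical)
- id: smallSetup
  flavor: my-small-flavor
  # Define the list once and give it an anchor name
  providedCapabilities: &commonCapabilities
    - docker
    - python

- id: largeSetup
  flavor: my-large-flavor
  # Reuse the anchored list through an alias instead of repeating it
  providedCapabilities: *commonCapabilities
```

An anchor must be defined before the first alias that refers to it, so the shared value has to appear in the first setup that uses it.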

In addition, the configuration properties steep.services and steep.plugins now support glob patterns (e.g. **/*.yaml). This allows you to, for example, recursively include other configuration files from a directory tree without having to specify each of them individually.
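The following fragment sketches what this might look like in Steep’s main configuration file. The directory layout (conf/services and conf/plugins) is an assumption made for this example, so adapt the paths to your own installation:

```yaml
# Illustrative only; the directory names are hypothetical
steep:
  # Include every YAML file below conf/services, at any depth
  services: conf/services/**/*.yaml
  # Same for plugin descriptors
  plugins: conf/plugins/**/*.yaml
```

Note that a pattern starting with an asterisk (such as **/*.yaml on its own) has to be put in quotes, because YAML otherwise interprets the leading * as the start of an alias.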

Provisioning scripts now support a new upload function that can be used to upload one or more files to a virtual machine. Read the new section on provisioning scripts in Steep’s documentation for more information.

Other new features

Here’s a list of other noteworthy improvements and bug fixes:

Display allocated process chain ID in agent details
Reduce the number of log messages, in particular when agents join or leave the cluster
Run PostgreSQL database migration only once per Steep instance
Improve graceful shutdown
Add possibility to restore cluster members on startup from VM registry

Maintenance

Run unit tests in parallel
Speed up MongoDB unit tests
Update web UI dependencies

Bug fixes

Allow runtime plugins to access current Vert.x context
Do not fail to delete OpenStack block device if it does not exist

Posted by Michel Krämer
on 19 May 2021

