Steep - Run Scientific Workflows in the Cloud

For the last couple of years, I’ve been working on a scientific workflow management system called Steep. I’m more than happy to announce that, here at Fraunhofer IGD, we decided to make Steep Open-Source! You can download the binaries and the source code of Steep from its GitHub repository.

A scientific workflow management system is an application that executes data-driven workflows, which apply a series of processing services (often called actions, tasks, or microservices) to input files to produce a certain output. Steep is designed to be scalable. It can run on your laptop, but also in the Cloud, in a Grid, or in a Cluster, and distribute the individual processing services to multiple compute nodes. As such, it is very well suited to harness the possibilities of distributed computing to parallelise work and to speed up your data processing workflows, no matter how complex they are and regardless of how much data you need to process.
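To give you an idea, here is what a minimal workflow could look like in Steep’s YAML syntax (a sketch only; the service name and parameter IDs are made-up examples, and the api version depends on your Steep release):

    api: 4.0.0
    vars:
      - id: inputFile
        value: /data/input.txt
      - id: outputFile
    actions:
      - type: execute
        service: copy
        inputs:
          - id: input_file
            var: inputFile
        outputs:
          - id: output_file
            var: outputFile

The workflow applies the service ‘copy’ to the file referenced by the variable ‘inputFile’; Steep assigns a value to ‘outputFile’ during execution.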

The following is a list of Steep’s key features:

Cyclic workflow graphs

In contrast to other scientific workflow management systems, Steep supports cyclic workflow graphs without a priori runtime knowledge (e.g. the number of loop iterations). Workflows are converted incrementally and on demand to so-called process chains. This allows Steep to execute workflows that dynamically change their structure at runtime.
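For example, a for-each action can iterate over a list whose items are only produced while the workflow is running, so the number of process chains is not known up front. A sketch (attribute names follow Steep’s documentation but may vary between versions):

    actions:
      - type: for
        input: filesToProcess     # may be produced dynamically at runtime
        enumerator: f             # the current item in each iteration
        output: results           # collects the yielded values
        yieldToOutput: r          # the variable appended to 'results'
        actions:
          - type: execute
            service: process-file
            inputs:
              - id: input
                var: f
            outputs:
              - id: output
                var: r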

Capability-based scheduling

Steep has an optimized, capability-based scheduler that executes process chains in parallel by distributing them to multiple agents (i.e. Steep instances running in the Cloud or in a Cluster).
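As a sketch of how the pieces fit together (the configuration key follows Steep’s documentation; the capability names are made up): a service declares which capabilities it requires, and each agent advertises which ones it provides. The scheduler only assigns a process chain to an agent whose capabilities cover the required ones.

    # in the service metadata: this service needs a GPU
    - id: render
      requiredCapabilities:
        - gpu

    # in the agent's steep.yaml: this instance provides a GPU
    steep.agent.capabilities:
      - gpu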

Automatic failover

Crashed workflows can be resumed without loss of information, even if no database is configured.

Microservice integration

In order to integrate and execute processing services or microservices with almost arbitrary interfaces, Steep makes use of so-called service metadata. You can use this metadata to describe the interface of your existing binaries without having to modify them or adapt them to a certain processing framework.
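For illustration, the metadata for a service wrapping the standard cp command could look like this (a minimal sketch following the documented schema):

    - id: cp
      name: Copy
      description: Copies a file
      path: cp              # the executable to run, unmodified
      runtime: other        # a plain binary (see runtime environments below)
      parameters:
        - id: input_file
          name: Input file
          description: The file to copy
          type: input
          cardinality: 1..1
          dataType: file
        - id: output_file
          name: Output file
          description: The destination
          type: output
          cardinality: 1..1
          dataType: file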

Runtime environments

Steep has built-in runtime environments for executable microservices that are provided as binaries or Docker images.
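For a service shipped as a Docker image, you would select the docker runtime in the service metadata and let path point to the image instead of a local binary (the image name is a made-up example):

    - id: classify
      name: Classification service
      description: Classifies objects in a point cloud
      path: example/classify:latest   # Docker image instead of a binary
      runtime: docker
      parameters: []                  # omitted for brevity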

Plugins

Plugins let you modify generated process chains, change the way agents collect results, or add custom runtime environments (e.g. Python, AWS Lambda, Web Processing Services).
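For instance, a process chain adapter is a Kotlin script that is registered in Steep’s plugin configuration and may rewrite process chains before they are executed. A sketch (the descriptor format and function signature follow the documentation, but treat them as illustrative):

    # conf/plugins/common.yaml
    - name: myProcessChainAdapter
      type: processChainAdapter
      scriptFile: conf/plugins/myProcessChainAdapter.kt

    // conf/plugins/myProcessChainAdapter.kt
    suspend fun myProcessChainAdapter(processChains: List<ProcessChain>,
        workflow: Workflow, vertx: Vertx): List<ProcessChain> {
      // example: require a GPU for every generated process chain
      return processChains.map {
        it.copy(requiredCapabilities = it.requiredCapabilities + "gpu")
      }
    }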

Optional databases

If you want your workflows to be persisted in a database over a long period of time, Steep offers support for MongoDB and PostgreSQL. In the default configuration, Steep requires no database and keeps everything in memory. Note that, even in this setup, individual Steep instances in your cluster share the same memory, which already provides good fault tolerance.
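Switching to a persistent database is just a matter of configuration. A sketch of the relevant entries in steep.yaml (the key names follow the documentation; the connection details are placeholders):

    steep.db.driver: postgresql     # or mongodb; the default is inmemory
    steep.db.url: jdbc:postgresql://localhost:5432/steep
    steep.db.username: steep
    steep.db.password: steep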

HTTP and web interfaces

Steep has a REST-like HTTP interface and a web-based user interface for monitoring, and it can provide metrics to Prometheus.
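For example, you can submit a workflow to the HTTP interface and then poll for its status (host and port are the defaults from the documentation; the submission ID is a placeholder):

    # submit a workflow; the response contains a submission ID
    curl -X POST http://localhost:8080/workflows \
        --data-binary @workflow.json

    # check the status of the submission
    curl http://localhost:8080/workflows/<submission id>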

Scalability

Its asynchronous event-driven architecture allows Steep to scale horizontally across multiple machines in your cluster and to support complex dynamic workflows with thousands of tasks.

Production-ready

Steep is very reliable and has been used in production for many years to execute workflows from various domains. The source code has very high test coverage.

Open-Source

Did I mention Steep is free and Open-Source? It is released under the Apache License, Version 2.0. The code can be found in the GitHub repository.

Version 5.0.0

Today, we also released the new version 5.0.0 of Steep with the following new features:

  • Process chains are now executed in the order they have been added to the registry
  • New capability-based scheduling algorithm (see below)
  • Temporary process chain results are no longer kept in memory if they are not needed
  • The scheduler can now be disabled
  • Errors are no longer logged while trying to connect to a new VM via SSH
  • A minimum number of VMs can now be specified per setup
  • Alternative setups with similar provided capabilities can now be specified
  • New process chain adapter plugins

One of the highlights is the new capability-based scheduling algorithm. With the old algorithm, workflow execution could stall if there was a process chain that could not be executed because no suitable agent was available, even if other agents would have been able to execute the remaining process chains. The new algorithm executes process chains in the order in which they were added to the registry and always fetches as many of them from the registry as possible, as long as there are agents available that can execute them. Process chains that cannot be executed because they require capabilities none of the agents can provide are skipped, but they are resumed immediately as soon as an agent with matching capabilities joins the cluster.
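In simplified pseudo-Kotlin, the idea looks roughly like this (a sketch, not Steep’s actual code):

    data class ProcessChain(val id: String,
        val requiredCapabilities: Set<String>)
    data class Agent(val id: String, val capabilities: Set<String>)

    // One scheduling pass: walk the registry in insertion order and assign
    // every process chain for which a capable agent is available. Chains
    // whose required capabilities no agent provides are skipped; they stay
    // in the registry and are retried when a matching agent joins.
    fun schedule(registry: MutableList<ProcessChain>,
        availableAgents: MutableList<Agent>): Map<ProcessChain, Agent> {
      val assignments = mutableMapOf<ProcessChain, Agent>()
      val i = registry.iterator()
      while (i.hasNext() && availableAgents.isNotEmpty()) {
        val pc = i.next()
        val agent = availableAgents.firstOrNull { a ->
          a.capabilities.containsAll(pc.requiredCapabilities)
        }
        if (agent != null) {
          assignments[pc] = agent
          availableAgents.remove(agent)
          i.remove()
        }
        // else: skip this chain for now
      }
      return assignments
    }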

Publications

Steep was initially developed within the research project ‘IQmulus’ (A High-volume Fusion and Analysis Platform for Geospatial Point Clouds, Coverages and Volumetric Data Sets), funded from the 7th Framework Programme of the European Commission, call identifier FP7-ICT-2011-8, under grant agreement no. 318787, from 2012 to 2016. It was previously called the ‘IQmulus JobManager’ or just the ‘JobManager’. Under this name, it has appeared in at least the following publications:

Krämer, M. (2018). A Microservice Architecture for the Processing of Large Geospatial Data in the Cloud (Doctoral dissertation). Technische Universität Darmstadt. https://doi.org/10.13140/RG.2.2.30034.66248
Böhm, J., Bredif, M., Gierlinger, T., Krämer, M., Lindenbergh, R., Liu, K., … Sirmacek, B. (2016). The IQmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds. ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLI-B3, 301–307. https://doi.org/10.5194/isprs-archives-XLI-B3-301-2016
Krämer, M., & Senner, I. (2015). A modular software architecture for processing of big geospatial data in the cloud. Computers & Graphics, 49, 69–81. https://doi.org/10.1016/j.cag.2015.02.005

Documentation and getting started

I’m currently working on a comprehensive web page describing the features of Steep and how you can execute workflows with it. I will keep you updated here on this site and let you know when the web page is ready.


Posted by Michel Krämer
on February 6th, 2020.