Steep 6.0.0

I’m very proud to announce a new major version of my workflow management system Steep! This is the biggest release so far. It contains many new features and bug fixes, but also some breaking changes. Make sure to have a look at the detailed list below and read the documentation on the Steep website.

Steep is a scientific workflow management system that can execute data-driven workflows in the Cloud. It is well suited to harnessing distributed computing to parallelise work and speed up your data processing workflows, no matter how complex they are or how much data you need to process. Steep is open-source software developed at Fraunhofer IGD. You can download the binaries and the source code of Steep from its GitHub repository.

Highlights in this release

  • Improved workflow syntax
  • Improved workflow validation
  • Eager process chain generation (improved parallelization)
  • Named workflows
  • Workflow priorities
  • Full-text search
  • Improved performance (Database + Web UI + HTTP + Event bus)

New features

  • Improved workflow model:
    • Support input parameters with values instead of pre-defined variables
    • Allow output variables to be used without declaring them in vars
  • Enable full parallelisation of process chains and for-each actions:
    • Process chains are now generated eagerly. This means that as soon as the first results of running process chains are available, new process chains can be generated.
    • For-each actions are now eagerly unrolled. This means that the execution of for-each actions that depend on the output of other actions or that are recursive can now start even if the results are only partially available.
  • Add possibility to prioritize workflows
  • Add powerful full-text search for workflows and process chains
  • Improved workflow validation:
    • Disallow using a variable as an output more than once
    • Make sure enumerators cannot be reused as enumerators or outputs
    • Validate that variable values are accessed within the right scope
    • Display path to error in validation result
  • Add dependency injection mechanism for plugin interfaces
  • Add process chain consistency checker plugin
  • Add plugin versions
  • Add more Prometheus metrics
  • Display timeout policies in UI
  • Display workflow name in UI
  • Display original YAML source on workflow detail page in UI if available
  • Add button to create new workflow to UI
  • Improved performance:
    • Database requests (fewer requests and faster queries)
    • HTTP API
    • Web UI
    • Cluster communication
  • Implement timeout for creating a VM
  • Compress large messages sent over the event bus
  • Add shortcut button to create new workflow from scratch to UI
  • Add more parameters for cluster configuration:
    • Add possibility to configure placement group name
    • Add possibility to make a Steep instance a Hazelcast lite member
  • Improve reliability by increasing backup count of distributed Hazelcast data structures

Breaking changes

  • Hazelcast has been updated. Steep 6 instances cannot connect to Steep 5.x instances. You have to restart your whole cluster during the update.
  • Workflow API version 3.x has been removed. Please upgrade your workflows to API version 4.x.
  • All model properties are now camel case. For example, data_type in service metadata has been renamed to dataType. The same applies to properties such as required_capabilities or file_suffix. Please refer to the documentation on the Steep website for more information.
  • The deprecated property supportedServiceId in plugin descriptors has been removed. It was already replaced by supportedServiceIds in earlier versions.
  • Deprecated plugin interfaces have been removed.
  • Executable.serviceId is now mandatory. Make sure to update your plugins if you create Executable objects (this particularly applies to process chain adapters).
  • The deprecated configuration property onlyTraverseDirectoryOutputs has been removed. Steep now traverses output directories only if the dataType in the service metadata is directory. All other data types will not be traversed but directly passed to the subsequent service.
  • Deprecated store actions and action parameters have been removed.
  • Configuration items denoting periods of time have been renamed. All items must now be specified as durations. For example, lookupIntervalMilliseconds: 2000 becomes lookupInterval: 2s. Refer to the documentation for an overview of all configuration items.
  • Configuration item steep.http.cors.maxAge has been renamed to steep.http.cors.maxAgeSeconds to make clear that it has to be specified in seconds and to be in line with the corresponding CORS HTTP header.
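
If you maintain many service metadata or configuration files, two of the renames above can be partly automated. The following Python snippet is only a rough sketch of such a migration and not part of Steep: the function names are made up for illustration, it assumes your files have already been parsed into dictionaries and lists, it assumes that every renamed duration item simply drops the Milliseconds suffix (the release notes confirm this only for lookupInterval), and it assumes that "ms" is accepted as a duration unit. Check the result against the documentation before using it.

```python
import re

# Hypothetical migration helpers for illustration; they are not part of Steep.

_SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)+")


def to_camel_case(name):
    """Turn a snake_case name into camelCase; leave anything else untouched."""
    if not isinstance(name, str) or not _SNAKE_CASE.fullmatch(name):
        return name
    head, *rest = name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def camel_case_keys(value):
    """Recursively rename all snake_case keys in nested dicts and lists.

    Only keys are renamed; values (e.g. service IDs) are kept as they are.
    """
    if isinstance(value, dict):
        return {to_camel_case(k): camel_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camel_case_keys(v) for v in value]
    return value


def migrate_duration_items(config):
    """Rename '...Milliseconds' items and convert their values to durations.

    Assumption: the new item name is the old one without the suffix.
    Nested dicts are migrated recursively; other values are copied as is.
    """
    suffix = "Milliseconds"
    migrated = {}
    for key, value in config.items():
        if isinstance(value, dict):
            migrated[key] = migrate_duration_items(value)
        elif (
            isinstance(key, str)
            and key.endswith(suffix)
            and isinstance(value, int)
            and not isinstance(value, bool)
        ):
            new_key = key[: -len(suffix)]
            # Use whole seconds where possible; the "ms" unit is an assumption
            migrated[new_key] = (
                f"{value // 1000}s" if value % 1000 == 0 else f"{value}ms"
            )
        else:
            migrated[key] = value
    return migrated


print(camel_case_keys({"data_type": "directory", "file_suffix": ".txt"}))
print(migrate_duration_items({"lookupIntervalMilliseconds": 2000}))
```

Note that camel_case_keys renames every snake_case key it finds, including keys in free-form maps that you may want to keep, and that migrate_duration_items deliberately leaves items such as steep.http.cors.maxAgeSeconds alone because they are still specified as plain numbers.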

Bug fixes

  • Fix issue where some Docker containers were not killed when the workflow was cancelled
  • Do not fail if plugin configuration file is empty
  • Fix reading arbitrarily large GridFS files from MongoDB
  • Do not create more VMs if enough VMs already provide a given required capability set
  • Fix intermittent crashes in UI if connection to event bus was lost
  • When looking for orphaned process chains, the scheduler no longer sends a message to itself

Maintenance

  • Upgrade Vert.x to 4.3.0
  • Switch to Zulu OpenJDK Docker base image
  • Install security patches on Docker image build
  • Update other dependencies

Posted by Michel Krämer
on June 27th, 2022.