Steep 5.7.0

I’m very happy to announce that the new version of my scientific workflow management system Steep has just been released. The new version introduces live process chain logs, improved VM management, and many other things (see the complete list below). The version has been thoroughly tested in practice over the last couple of months.

Steep is a scientific workflow management system that can execute data-driven workflows in the Cloud. It is very well suited to harness the possibilities of distributed computing in order to parallelise work and to speed up your data processing workflows no matter how complex they are and regardless of how much data you need to process. Steep is open-source software developed at Fraunhofer IGD. You can download the binaries and the source code of Steep from its GitHub repository.

Log file configuration

It is now possible to configure Steep’s main log file and to enable process chain logs. For this, add the following lines to your steep.yaml file:

steep:
  logs:
    level: INFO
    main:
      enabled: true
      logFile: logs/steep.log
      dailyRollover:
        enabled: true
        maxDays: 7
        maxSize: 104857600  # 100 MB
    processChains:
      enabled: true
      path: logs/processchains
      groupByPrefix: 3

The main log file and process chain logs can be enabled separately. dailyRollover controls whether the main log file should be split into smaller files on a daily basis. If this feature is enabled, the main log file will be renamed every day. The file name of an old log file will be based on the value of logFile and the file’s date in the form YYYY-MM-DD (e.g. logs/steep.2020-11-19.log). maxDays and maxSize control the maximum number as well as the maximum total size of all log files. Log files beyond these limits will automatically be deleted.
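The naming scheme for rolled-over files can be sketched in a few lines of Python. This only illustrates the pattern described above and is not Steep's actual implementation; the function name rolled_over_name is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def rolled_over_name(log_file: str, day: date) -> str:
    """Insert the date (YYYY-MM-DD) between the file's base name and its extension."""
    path = PurePosixPath(log_file)
    # Example: "logs/steep.log" rolled over on 2020-11-19
    # becomes "logs/steep.2020-11-19.log".
    return str(path.with_name(f"{path.stem}.{day.isoformat()}{path.suffix}"))
```

Calling rolled_over_name("logs/steep.log", date(2020, 11, 19)) yields the example file name given above.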

You can also enable separate process chain logs that will record the output of process chain executables. Steep creates a new log file per executed process chain in the given path. Since this can lead to a high number of files, you can specify a groupByPrefix. If this prefix has a value greater than 0, the process chain log files will be grouped by prefix in subdirectories under the configured path. For example, if groupByPrefix equals 3, Steep will create a separate subdirectory for all process chains whose IDs start with the same three characters. The name of this subdirectory will be these three characters. The process chains apomaokjbk3dmqovemwa and apomaokjbk3dmqovemsq will be put into a subdirectory called apo, and the process chain ao344a53oyoqwhdelmna will be put into ao3. Note that in practice, 3 is a reasonable value, which will create a new directory about every day. A value of 0 disables grouping. The default value is 0.
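The grouping rule can be expressed as a small Python function. This is a minimal sketch of the behaviour described above, not code taken from Steep. The function name process_chain_log_path is hypothetical, and the assumption that each log file is named after the process chain ID with a .log extension is mine.

```python
from pathlib import PurePosixPath


def process_chain_log_path(base_path: str, process_chain_id: str,
                           group_by_prefix: int = 0) -> str:
    """Return where a process chain's log file would be stored."""
    directory = PurePosixPath(base_path)
    if group_by_prefix > 0:
        # Group by the first N characters of the ID, e.g. "apo" for N = 3.
        directory = directory / process_chain_id[:group_by_prefix]
    # ASSUMPTION: the file name is "<id>.log"; the post does not state it.
    return str(directory / f"{process_chain_id}.log")
```

With groupByPrefix set to 3, the two process chains starting with apo end up in the same subdirectory, while a value of 0 places every file directly under the configured path.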

Process chain logs can also be accessed through the new HTTP endpoint /logs/processchains/<id>.
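As a rough sketch, the endpoint could be queried from Python using only the standard library. The helper names below are hypothetical, the base URL of your Steep instance is something you have to supply yourself, and I am assuming that the response body is text that can be decoded as UTF-8.

```python
from urllib.parse import quote
from urllib.request import urlopen


def process_chain_log_url(base_url: str, process_chain_id: str) -> str:
    """Build the URL of the log endpoint for one process chain."""
    return f"{base_url.rstrip('/')}/logs/processchains/{quote(process_chain_id, safe='')}"


def fetch_process_chain_log(base_url: str, process_chain_id: str,
                            timeout: float = 10.0) -> str:
    """Download the log of a process chain and return it as a string."""
    url = process_chain_log_url(base_url, process_chain_id)
    # urlopen raises urllib.error.HTTPError for error status codes,
    # for example if no log exists for the given ID.
    with urlopen(url, timeout=timeout) as response:
        # ASSUMPTION: the body is UTF-8 text.
        return response.read().decode("utf-8", errors="replace")
```

For example, fetch_process_chain_log("http://localhost:8080", "apomaokjbk3dmqovemwa") would request the log of that process chain, assuming Steep is reachable at this address.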

Real-time process chain logs

If process chain logs are enabled, you will be able to follow the output of executables in Steep’s web UI. Open the UI in your browser and navigate to the process chain you want to monitor. You will see a new “Log” section like in the image below. The log automatically updates while the process chain is being executed, so you can follow updates in real time.

Improved VM management

It is now possible to attach additional volumes to VMs created by Steep’s cloud manager. Also, the cloud manager can now create multiple VMs in parallel if you specify the maxCreateConcurrent parameter in the VM’s setup. Note that the cloud manager will only create as many VMs as necessary for the workflows currently being executed, which helps save resources.
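To illustrate how these two rules interact, here is a deliberately simplified model in Python. It is not Steep's scheduling code. The function name and its parameters are hypothetical, and I am assuming that maxCreateConcurrent caps the number of VMs being created at the same time.

```python
def vms_to_start(needed: int, running: int, creating: int,
                 max_create_concurrent: int) -> int:
    """Return how many additional VMs may be requested right now."""
    # Only create what the current workflows actually need ...
    missing = max(0, needed - running - creating)
    # ... and never exceed the assumed limit of concurrent creations.
    free_slots = max(0, max_create_concurrent - creating)
    return min(missing, free_slots)
```

In this model, a workflow that needs five VMs while one is already running and the limit is three would trigger three creations now and the remaining one later. If only two VMs are needed, just one more is created, regardless of the limit.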

Other new features

Here’s a list of other noteworthy improvements and bug fixes:

  • New runtime plugin interface, which allows the output of executables to be logged immediately. The old interface is still available but deprecated. It will be removed in Steep 6.0.0.
  • Sort services in UI alphabetically
  • Log caller location on retry

Maintenance

  • Update Gradle to 6.8.3
  • Update Kotlin to 1.4.30
  • Minor dependency updates
  • Update testcontainers to fix CI build

Bug fixes

  • Do not retry an executable if the process chain has been cancelled
  • Allow process chain to be cancelled even if agent currently waits for retry

Posted by Michel Krämer
on March 3rd, 2021.