I’m very happy to announce that the new version of my scientific workflow management system Steep has just been released. The new version introduces live process chain logs, improved VM management, and many other things (see complete list below). The version has been thoroughly tested in practice over the last couple of months.
Steep is a scientific workflow management system that can execute data-driven workflows in the Cloud. It is very well suited to harness the possibilities of distributed computing in order to parallelise work and to speed up your data processing workflows, no matter how complex they are and regardless of how much data you need to process. Steep is open-source software developed at Fraunhofer IGD. You can download the binaries and the source code of Steep from its GitHub repository.
Log file configuration
It is now possible to configure Steep’s main log file and to enable process
chain logs. For this, add the following lines to your Steep configuration file:
    steep:
      logs:
        level: INFO
        main:
          enabled: true
          logFile: logs/steep.log
          dailyRollover:
            enabled: true
            maxDays: 7
            maxSize: 104857600 # 100 MB
        processChains:
          enabled: true
          path: logs/processchains
          groupByPrefix: 3
The main log file and process chain logs can be switched on and off independently through their enabled flags.
dailyRollover controls whether the main log file should be split into smaller
files on a daily basis. If this feature is enabled, the main log file will be
renamed every day. The file name of the old log will be based on the value of
logFile and the file’s date. maxDays and maxSize control the maximum
number as well as the maximum total size of all log files. Log files beyond
these limits will automatically be deleted.
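To make the two limits more tangible, here is a minimal Python sketch of how such a clean-up could work. It is not Steep's actual implementation: the LogFile class and the files_to_delete function are hypothetical, and the sketch assumes that the newest files are kept and the oldest ones are deleted first once either limit is reached.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LogFile:
    """Hypothetical description of one rolled-over log file."""
    name: str
    day: date
    size: int  # size in bytes


def files_to_delete(files, max_days, max_size):
    """Return the files that exceed the maximum number or total size.

    Assumption: files are considered from newest to oldest, and as soon
    as one limit is exceeded, this file and all older ones are deleted.
    """
    kept_size = 0
    over_limit = False
    doomed = []
    ordered = sorted(files, key=lambda f: f.day, reverse=True)
    for index, f in enumerate(ordered):
        if not over_limit and index < max_days and kept_size + f.size <= max_size:
            kept_size += f.size
        else:
            over_limit = True
            doomed.append(f)
    return doomed
```

With the values from the configuration above (maxDays: 7, maxSize: 104857600), three rolled-over files of 40 MB each would exceed the size limit, so the sketch marks the oldest of the three for deletion.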
You can also enable separate process chain logs that record the output of
process chain executables. Steep creates a new log file per executed process
chain in the given path. Since this can lead to a high number of files, you
can specify a groupByPrefix. If this prefix has a value greater than 0,
process chain log files will be grouped by prefix in subdirectories under the
path. For example, if groupByPrefix is 3, Steep will
create a separate subdirectory for all process chains whose IDs start with the
same three characters. The name of this subdirectory will be these three
characters. Process chains whose IDs start with apo
will be put into a subdirectory called apo, and the process chain
ao344a53oyoqwhdelmna will be put into ao3. Note that in practice, 3 is a
reasonable value, which will create a new directory about every day. A value
of 0 disables grouping.
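The grouping rule can be sketched in a few lines of Python. The function name process_chain_log_path is hypothetical, and the .log file extension is an assumption made for illustration only; just the prefix logic follows the description above.

```python
from pathlib import PurePosixPath


def process_chain_log_path(base_path, process_chain_id, group_by_prefix):
    """Build the location of a process chain log file.

    Assumption: the log file is named after the process chain ID with a
    '.log' extension. Only the prefix grouping mirrors the text above.
    """
    base = PurePosixPath(base_path)
    file_name = f"{process_chain_id}.log"
    if group_by_prefix > 0:
        # Group by the first N characters of the process chain ID
        return base / process_chain_id[:group_by_prefix] / file_name
    # A value of 0 disables grouping
    return base / file_name


print(process_chain_log_path("logs/processchains", "ao344a53oyoqwhdelmna", 3))
# prints "logs/processchains/ao3/ao344a53oyoqwhdelmna.log"
```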
Process chain logs can also be accessed through a new HTTP endpoint.
Real-time process chain logs
If process chain logs are enabled, you will be able to follow the output of executables in Steep’s web UI. Open the UI in your browser and navigate to the process chain you want to monitor. You will see a new “Log” section like in the image below. The log automatically updates while the process chain is being executed, so you can follow updates in real time.
Improved VM management
It is now possible to attach additional volumes to VMs created by Steep’s
cloud manager. Also, the cloud manager can now create multiple VMs in
parallel if you specify the
maxCreateConcurrent parameter in the VM’s setup.
Note that the cloud manager will only create as many VMs as necessary for the
workflows currently being executed, which helps save resources.
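The idea behind these two points can be illustrated with a small Python sketch. It does not show Steep's cloud manager: create_vms and FakeCloud are hypothetical names, and the sketch simply assumes that the number of missing VMs is known and that at most max_create_concurrent creation requests should run at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time


def create_vms(required_vms, running_vms, max_create_concurrent, create_vm):
    """Create only the missing VMs, at most max_create_concurrent at once."""
    missing = max(0, required_vms - running_vms)
    if missing == 0:
        return []  # nothing to do, so no resources are wasted
    # The size of the thread pool bounds the number of parallel creations
    with ThreadPoolExecutor(max_workers=max_create_concurrent) as pool:
        return list(pool.map(create_vm, range(missing)))


class FakeCloud:
    """Stand-in for a real cloud API that records the peak parallelism."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def create_vm(self, index):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # pretend that booting takes a moment
        with self._lock:
            self._active -= 1
        return f"vm-{index}"
```

Calling create_vms(5, 1, 2, FakeCloud().create_vm) would request four VMs in total, but never more than two at the same time.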
Other new features
Here’s a list of other noteworthy improvements and bug fixes:
- New runtime plugin interface, which allows the output of executables to be logged immediately. The old interface is still available but deprecated. It will be removed in Steep 6.0.0.
- Sort services in UI alphabetically
- Log caller location on retry
- Update Gradle to 6.8.3
- Update Kotlin to 1.4.30
- Minor dependency updates
- Update testcontainers to fix CI build
- Do not retry an executable if the process chain has been cancelled
- Allow process chain to be cancelled even if agent currently waits for retry
Posted by Michel Krämer
on 3 March 2021