Efficient scheduling of workflow actions in the cloud

My paper “Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities” has just been published in Springer’s Communications in Computer and Information Science book series (CCIS).

Distributed scientific workflow management systems processing large data sets in the Cloud face the following challenges: (a) workflow tasks require different capabilities from the machines on which they run, while the infrastructure itself is highly heterogeneous; (b) the environment is dynamic, and resources can be added and removed at any time; (c) scientific workflows can become very large, with hundreds of thousands of tasks; and (d) faults can occur at any time in a distributed system.

In this paper, I present a software architecture and a capability-based scheduling algorithm that address all of these challenges in a single design. My architecture consists of loosely coupled components that can run on separate virtual machines and communicate with each other over an event bus and through a database. The scheduling algorithm matches the capabilities required by the tasks (e.g. software, CPU power, main memory, graphics processing unit) with those offered by the available virtual machines and assigns the tasks to them accordingly for processing. My approach utilises heuristics to distribute the tasks evenly in the Cloud. This reduces the overall run time of workflows and makes efficient use of available resources. My scheduling algorithm also implements optimisations to achieve high scalability. I perform a thorough evaluation based on four experiments and test whether my approach meets the challenges mentioned above.
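To give a rough idea of what capability-based matching means, here is a minimal Python sketch. It is not the algorithm from the paper and not Steep's implementation: the names `Task`, `Machine`, and `schedule` are hypothetical, capabilities are modelled as plain strings, and the "distribute evenly" heuristic is assumed to be a simple least-loaded choice.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Task:
    # Hypothetical model: a task and the capabilities it requires
    id: str
    required: FrozenSet[str]


@dataclass
class Machine:
    # Hypothetical model: a virtual machine and the capabilities it offers
    id: str
    offered: FrozenSet[str]
    assigned: List[str] = field(default_factory=list)


def schedule(tasks: List[Task], machines: List[Machine]) -> List[str]:
    """Assign each task to the least-loaded machine that offers all
    required capabilities. Returns the IDs of tasks that no machine
    can currently run."""
    unassigned = []
    for task in tasks:
        # A machine qualifies if the required set is a subset of the offered set
        candidates = [m for m in machines if task.required <= m.offered]
        if not candidates:
            # No suitable machine right now; a real system could retry
            # later when new resources are added
            unassigned.append(task.id)
            continue
        # Assumed heuristic: pick the candidate with the fewest assigned tasks
        target = min(candidates, key=lambda m: len(m.assigned))
        target.assigned.append(task.id)
    return unassigned


machines = [
    Machine("vm-a", frozenset({"docker"})),
    Machine("vm-b", frozenset({"docker", "gpu"})),
]
tasks = [
    Task("t1", frozenset({"docker"})),
    Task("t2", frozenset({"gpu"})),
    Task("t3", frozenset({"docker"})),
    Task("t4", frozenset({"fpga"})),
]

leftover = schedule(tasks, machines)
for m in machines:
    print(m.id, m.assigned)
print("unassigned:", leftover)
```

In this toy run, the task requiring `fpga` stays unassigned because no machine offers that capability. A real scheduler would keep such tasks in a queue and try again once the set of available machines changes.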

The paper finishes with a discussion, conclusions, and future research opportunities. An implementation of my algorithm and software architecture is publicly available as part of the open-source workflow management system Steep.

Reference

Krämer, M. (2021). Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities. In S. Hammoudi, C. Quix, & J. Bernardino (Eds.), Data Management Technologies and Applications. Communications in Computer and Information Science (Vol. 1446, pp. 32–55). Springer. https://doi.org/10.1007/978-3-030-83014-4_2

Download

According to Springer’s self-archiving policy, you may download the manuscript pre-print here. The final authenticated version is available on the publisher’s website.


Posted by Michel Krämer
on July 23rd, 2021.