To really get the discussion going that we started a few weeks ago:
It would be beneficial to re-use the cloud_ilastik architecture for running jobs on different target systems (local, slurm, etc.) for my implementations of scalable (3D) segmentation and image analysis algorithms, currently available here https://github.com/constantinpape/cluster_tools.
Briefly, my current implementation has three issues:
1. To implement a task for a given target, I use a mixin pattern. E.g. an ilastik slurm prediction task would look like `class IlastikPredictionSlurm(IlastikPredictionBase, SlurmTask)`, see this for details. This approach has the drawback that it does not scale well to new computation targets, because for each existing task one needs to define a new mixin subclass.
2. Monitoring and logging are convoluted (it's fine for me, because I know what's happening, but it's not easily usable for anyone else). This is not really tied to 1, but it would be great to implement a clean solution once and re-use it.
3. Re-running a partially failed job is very cumbersome, and it's usually easier to delete the (intermediate) result and rerun the whole job.
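To make issue 1 concrete, here is a minimal sketch of why the mixin pattern grows with tasks × targets. Only `IlastikPredictionSlurm`, `IlastikPredictionBase` and `SlurmTask` are names from above; their bodies, `LocalTask`, the `Watershed*` classes and the `Task` class at the end are hypothetical placeholders for illustration, not the actual cluster_tools or cloud_ilastik code.

```python
# Hypothetical target mixins: each one knows how to submit a job somewhere.
class LocalTask:
    def submit(self, job):
        return f"local: {job}"


class SlurmTask:
    def submit(self, job):
        return f"sbatch: {job}"


# Hypothetical task bases: each one knows what to compute, not where.
class IlastikPredictionBase:
    name = "ilastik_prediction"

    def run(self):
        # submit() is provided by whichever target mixin is combined in.
        return self.submit(self.name)


class WatershedBase:
    name = "watershed"

    def run(self):
        return self.submit(self.name)


# Mixin pattern: one subclass per (task, target) pair.
# Adding a new target means adding one more subclass for every task.
class IlastikPredictionLocal(IlastikPredictionBase, LocalTask):
    pass


class IlastikPredictionSlurm(IlastikPredictionBase, SlurmTask):
    pass


class WatershedLocal(WatershedBase, LocalTask):
    pass


class WatershedSlurm(WatershedBase, SlurmTask):
    pass


# One possible alternative (an assumption, not necessarily what
# cloud_ilastik does): pass the target in as an object, so a new target
# is a single new class and existing tasks stay untouched.
class Task:
    def __init__(self, name, target):
        self.name = name
        self.target = target

    def run(self):
        return self.target.submit(self.name)
```

With the second variant, `Task("watershed", SlurmTask()).run()` needs no dedicated subclass.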
The advantage of using the cloud_ilastik implementation: issue 1 is already solved more elegantly there.
I don't know how/if you have tackled issues 2 and 3 already, but at least moving to a more common code base would reduce redundant work. Also, this would allow cloud_ilastik to use the scalable algorithms I have implemented already.
This came up in the context of our more recent project for processing high-throughput screening data, where @Tomaz-Vieira had a closer look at the implementation: sciai-lab/batchlib#5. Since then, I have simplified the design there, because we don't really need a multi-target solution. But in general this issue is relevant for batch processing of 2D images as well. Also, for this project I have implemented a solution for issue 3 that works well for images and could probably be extended to nD chunked data, see this for details.
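As a rough illustration of what resuming a partially failed job (issue 3) can look like for per-image processing, here is a generic sketch. It is not the batchlib implementation linked above; `run_resumable`, the JSON status file and its layout are assumptions made up for this example.

```python
import json
from pathlib import Path


def run_resumable(items, process, status_path):
    """Process items one by one and record the finished ones on disk.

    On a re-run, items already listed in the status file are skipped,
    so only failed or not-yet-processed items are computed again.
    Returns the list of items that failed in this run.
    """
    status_path = Path(status_path)
    if status_path.exists():
        done = set(json.loads(status_path.read_text()))
    else:
        done = set()

    failed = []
    for item in items:
        if item in done:
            continue  # finished in an earlier run
        try:
            process(item)
        except Exception:
            failed.append(item)  # keep going, retry on the next run
            continue
        done.add(item)
        # Persist after every item so a crash loses at most one result.
        status_path.write_text(json.dumps(sorted(done)))
    return failed
```

For nD chunked data the items could be chunk ids instead of image names, under the assumption that chunks can be processed independently.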
More concretely, the questions I would like to explore:
How can we integrate cloud_ilastik and the algorithms in cluster_tools? Can I just use cloud_ilastik as is or is it better to implement a common parent library?
Are there existing solutions / libraries we can offload some work to? (I will open a follow-up issue on this soon.)