Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink how PipelineBuilder.combine 'inputs' and 'outputs' arguments #123

Open
elonp opened this issue Apr 20, 2024 · 3 comments
Open

Rethink how PipelineBuilder.combine 'inputs' and 'outputs' arguments #123

elonp opened this issue Apr 20, 2024 · 3 comments

Comments

@elonp
Copy link
Contributor

elonp commented Apr 20, 2024

Inputs:

Current behaviour:

If inputs argument is not provided:

All sub-pipeline inputs that are not connected to some other sub-pipeline's output are exposed as inputs to the new pipeline. They are grouped by name, so if two sub-pipelines have an (unconnected) input called INPUT1 the new pipeline will have an input called INPUT1 wired to the two sub-pipelines.

If inputs argument is provided:

It needs to be a dictionary mapping name (str) to list of sub-pipeline's inputs. Each unconnected sub-pipeline input must appear exactly once.

Suggestion:

Remove the inputs argument, and stay with the existing default behaviour. All desired outcomes can be achieved by renaming inputs in the sub-pipelines before calling combine. That's the way I am building pipelines for about a year now.

Outputs:

Current behaviour:

If outputs argument is not provided:

All sub-pipeline outputs that are not connected to some other sub-pipeline's input are exposed as outputs of the new pipeline. If more than one sub-pipeline has an (unconnected) output of a particular name, a concatenation stage is added taking all those outputs and exposing a single output that is a list of the original ones.

If outputs argument is provided:

It needs to be a dictionary mapping name (str) to list of sub-pipeline's outputs. If the list is > 1 then a concatenation stage is added as above. The provided dictionary defines the outputs, and is not an addition to them.

Suggestion 1:

Remove the outputs argument. Add a drop_output and clone_output methods to Pipeline. All desired behaviour can be achieve by a combination of drops, clones and renames.

Suggestion 2:

Replace the outputs argument with an additional_outputs argument. Add a drop_output method to Pipeline.

Suggestion 3 (orthogonal to suggestions 1 and 2):

Remove the auto-concatenation of outputs when multiple sub-pipelines have an unconnected output of the same name. Instead error. If users want a concatenation stage, they can add it themselves. The auto-concatenation is confusing and not very helpful because it is not possible to know which of the outputs came from which sub-pipeline.

@elonp
Copy link
Contributor Author

elonp commented Apr 20, 2024

@jzazo

@ms-lolo
Copy link
Collaborator

ms-lolo commented Dec 12, 2024

@elonp / @jzazo is this issue stale? should we close it in order to focus on rats-apps/devtools? just trying to tidy up a bit and do some tiny bit of planning here.

@jzazo
Copy link
Collaborator

jzazo commented Dec 12, 2024

it is stale, but I would not close it? If we refactor processors to the new di container structure, we may want to do some changes to the api... This is something we didn't make a decision on, but it will be long until we do?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants