How to combine flags from multiple operations in pipelines to enable parameter searches across multiple operations?

Let's say I have two operations, preprocessing and training.
Each has a series of flags that I want to run a hyperparameter search over.
Here is a rough idea of what my operations would look like:

preprocessing:
  flags:
    a:
    b:
training:
  flags:
    c:
    d:

Setting aside the practicality of this scenario, I want to run my preprocessing operation every time I run train. Is there a way to achieve this with pipelines or a similar method? I know I could do:

mypipeline:
  flags:
    a: [1,2,3]
    b: [4,5,6]
    c: [12,13,14]
    d: [15,16,17]
  steps:
    - preprocessing a=${a} b=${b}
    - train c=${c} d=${d}

and this would solve my problem, but when both operations have a lot of parameters it would require a lot of copying and pasting between the pipeline and operation flags. Is there a better or more efficient way to achieve this? I basically want the pipeline to “inherit” the flags of the steps it's performing, if possible. The documentation seems to hint at something like this

But when I create a pipeline with no flags and attempt to pass a flag that is defined by one of the steps, I get an error saying that the flag does not exist. Is what I am trying to do possible with Guild?

Thanks.

I started to think the copy/paste solution wouldn’t actually solve my problem, because I also need an additional run that should run only once, before preprocessing and train. Because that run is outside the pipeline, my steps cannot access it unless I set isolate-runs to False. However, this requires using the run: attribute in my step:

steps:
  - run: preprocessing
    isolate-runs: False
    flags: 
      a:
      b:

vs

steps:
  - preprocessing a=${a} b=${b}

As far as I know, there is no way to transfer the pipeline flags to the step flags in the first form. However, it turns out you can do this instead:

steps:
  - run: "preprocessing a=${a} b=${b}"
    isolate-runs: False

I found this by accident and couldn’t find anything about it in the documentation. I figured I would leave it here in case someone has a similar issue. A reference to where this behavior is documented would be helpful.

edit: corrected third guild file bit

In this case, you can pass the flag values along to the step this way:

steps:
  - run: preprocessing
    flags:
      a: ${a}
      b: ${b}

Though what you’ve done is also fine (note in your example you’ve left off the $ in the flag ref - I know what you meant :wink: )

I’ll reply to your previous post separately.


Unfortunately, you have to duplicate the flag defs for the pipeline operation. Addressing this is on the near-term roadmap.

How many flags are you rolling up to the pipeline op?

Right now I have around 10 and could potentially use more in the future.
Being able to access the step flags from the pipeline would definitely help a lot.

As another solution I tried using $include, but ran into issues there as well. I made two configs, one for train and one for preprocessing, and then included both in the pipeline:

    run_train:
      flags:
        $include: preprocessing-flags
        $include: train-flags
      steps:
        - run: preprocessing
          flags:
            a: ${a}
            b: ${b}
        - run: train
          flags:
            c: ${c}
            d: ${d}

but it only detected the flags from the last $include attribute, in this case train-flags. Can only one $include be used, or am I doing something wrong?

Provide a list for the $include attr:

run_train:
  flags:
    $include:
      - preprocessing-flags
      - train-flags
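
For reference, here is a rough sketch of how the pieces discussed in this thread might fit together in a single full-format Guild file. It is only an illustration: the model name `demo` and the default flag values are placeholders, and it assumes the preprocessing and train operations include the same config objects so that each flag is defined in one place. The per-step flag mapping still has to be written out by hand.

```yaml
# Shared flag definitions (default values are placeholders)
- config: preprocessing-flags
  flags:
    a: 1
    b: 4

- config: train-flags
  flags:
    c: 12
    d: 15

# `demo` is a hypothetical model name used only for this sketch
- model: demo
  operations:
    preprocessing:
      flags:
        $include: preprocessing-flags
    train:
      flags:
        $include: train-flags
    # Pipeline: pulls in both flag sets with a single list-valued $include
    run_train:
      flags:
        $include:
          - preprocessing-flags
          - train-flags
      steps:
        - run: preprocessing
          flags:
            a: ${a}
            b: ${b}
        - run: train
          flags:
            c: ${c}
            d: ${d}
```

With a layout like this, a search across both steps could presumably be started from the pipeline by passing list values on the command line, for example `guild run demo:run_train a=[1,2,3] c=[12,13,14]`.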

Thanks, that did it.


Is this still on the roadmap?

It is. Sorry, this is indeed taking too long to get in. I'm looking at this for the next major release (0.8), along with DVC integration and generalized run query support (there’s an excellent PR waiting for merge related to tags, but I want to see generalized, flexible query support across all attrs).