Dynamically generated parameters in pipeline

Conceptually I have a two-stage pipeline, where the first stage generates a set of flags (“hyper-hyper parameters”). In the second stage I want to combine those with a set of hyperparameters to optimize. The challenge is that, since the first-stage flags are created dynamically, I can’t know ahead of time how many there are.

I can do it manually like this:

guild run train x='[1,2,3]' y='[1,2,3]' @bigbatch.csv
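Here bigbatch.csv is a regular Guild batch file: the first row names the flags and each following row defines one trial. Something like this (the flag names are purely illustrative):

alpha,beta
0.1,10
0.2,20
0.3,30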

What I would like to do is have the train step use a bigbatch.csv generated by the upstream pipeline stage.

I’ve attached what I think it should look like at the guild.yml level:

train:
  description: Sample training script
  flags-import: all
  requires:
    - operation: bigbatch
bigbatch:
  description: make file bigbatch.csv
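For context, bigbatch is assumed to run a script that writes bigbatch.csv into its own run directory, which is what lets the requires dependency in train resolve the file. A minimal version might just add a main attribute, e.g.:

bigbatch:
  description: make file bigbatch.csv
  main: bigbatch  # runs a hypothetical bigbatch.py that writes bigbatch.csv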

This gives me a symlink to the correct file, named bigbatch.csv, in the “train folder” after the train operation runs. However, when I use the “@” batch notation, bigbatch.csv is taken from my cwd rather than from the run directory. Is there any way to reference a batch file like this in the guild.yml?

This is a bit of a hack, but I think it could be an approach that works for you:
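Roughly, the idea is a wrapper operation that shells back out to Guild using an exec command spec. Because exec runs in the operation’s run directory, the @bigbatch.csv reference resolves against the file the bigbatch dependency links in, rather than against your cwd. Something like this (untested sketch, adjust names to your project; -y just skips the confirmation prompt):

train-batch:
  description: Batch run of train over the generated bigbatch.csv
  exec: guild run -y train @bigbatch.csv
  requires:
    - operation: bigbatch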

Take a look and let me know if you have any questions or run into issues.

This looks really promising and I can get guild train-batch to run. My challenge is that I would like to be able to do something like…

guild run train-batch x='[1,2,3]'

…(or any HPO-type syntax) to take the outer product of the bigbatch trials with the HPO sweep.

You can pass flag values through in a command spec using ${FLAG_NAME}, so you could parameterize the list of values this way:
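Extending the sketch above (the default for x is only illustrative; it’s a string that the inner run parses as a list):

train-batch:
  description: Batch run of train over bigbatch.csv with a pass-through x
  exec: guild run -y train @bigbatch.csv x=${x}
  flags:
    x: '[1,2,3]'
  requires:
    - operation: bigbatch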

Note though that when you run train-batch, you need to quote the string argument to tell Guild the value is a string and not a list. Like this:

guild run train-batch x="'[5,6,7,8]'"
You are about to run train-batch
  bigbatch: ef0d8d741ea94bc8a1202f0815812b6e
  x: '[5,6,7,8]'
Continue? (Y/n)
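Inside the run, the exec line then expands to something like

guild run -y train @bigbatch.csv x='[5,6,7,8]'

so the inner batch run generates the outer product you’re after: one trial per row of bigbatch.csv for each of the four x values. (That expansion is my reading of the ${FLAG_NAME} substitution described above, so double-check it against a real run.)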