"Looping" over operations and requirements

I have a dependency tool chain that looks something like this example:

train:
  requires:
  - operation: prepare-data

prepare-data:
  requires:
  - operation: train

This is for a problem where I initialize learning from some manual annotations and then bootstrap the learning process by generating labels from the trained model. The whole process looks something like this:

manual_labels -> train_model_0 -> model_0_labels -> train_model_1 -> .... -> model_n_labels -> train_model_n

I am wondering how to handle this in Guild. Ideally I could specify the number of iterations n for this process, or some termination criterion such as a metric that stops improving.

You need a third operation, one that starts the chain off.

bootstrap-data: {}

generate-data:
  requires:
    - operation: train-model

train-model:
  requires:
    - operation: bootstrap-data|generate-data

As for running this N times, I would just write a bash script. Guild’s pipeline support doesn’t currently handle looping like that. It will in time; that’s on the road map. Same with early stopping.
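A minimal sketch of such a script, using the operation names from the Guild file above. The helper name run_chain is made up for illustration, and the sketch assumes that Guild resolves each operation requirement to the most recent matching run, so the order of the calls is what wires the chain together. It only handles a fixed iteration count, not early stopping.

```bash
#!/usr/bin/env bash
# Sketch: run the bootstrap chain for a fixed number of iterations.
# run_chain is a hypothetical helper; the operation names are the ones
# defined in the Guild file (bootstrap-data, generate-data, train-model).
set -eu

run_chain() {
  local n="$1"
  local i

  # Seed the chain once with the manually labeled data.
  guild run bootstrap-data -y
  # train_model_0: assumed to pick up the bootstrap-data run.
  guild run train-model -y

  for ((i = 1; i <= n; i++)); do
    # Generate labels from the latest trained model...
    guild run generate-data -y
    # ...then retrain on those labels (train_model_i).
    guild run train-model -y
  done
}

# Run only when an iteration count is given on the command line.
if [ "$#" -gt 0 ]; then
  run_chain "$1"
fi
```

Saved as run_chain.sh (again, a made-up name), `bash run_chain.sh 5` would train the initial model and then do five generate/retrain rounds.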

You can wrap that bash script in a Guild operation to run it and capture the results. Use the exec operation attribute rather than main. See the Bash example for hints.

If you have any questions or run into problems, update here with details and I can help.
