"Looping" over operations and requirements

copah · August 13, 2020, 7:00pm

I have a dependency tool chain that looks something like this example:

train:
  requires:
  - operation: prepare-data

prepare-data:
  requires:
  - operation: train

This is for a problem where I initialize learning from some manual annotations and then bootstrap the learning process by generating labels from the trained model. The whole process looks something like this:

manual_labels -> train_model_0 -> model_0_labels -> train_model_1 -> .... -> model_n_labels -> train_model_n

I am wondering how to handle this in Guild. Ideally I could specify the number of iterations n, or a termination criterion such as stopping when some metric no longer improves.

garrett · August 13, 2020, 8:45pm

You need a third operation, one that starts the chain off.

bootstrap-data: {}

generate-data:
  requires:
    - operation: train-model

train-model:
  requires:
    - operation: bootstrap-data|generate-data

As far as running this N times, I would just create a bash script. Guild’s current pipeline support doesn’t handle looping like that. It will in time; that’s on the road map. Same with early stopping.
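For illustration, here is a rough sketch of the loop such a script would run, written in Python rather than bash so the early-stopping check is easier to express. It assumes the three operations defined above, that `guild run -y <operation>` starts a run without prompting, and that Guild resolves the `bootstrap-data|generate-data` requirement to the most recent matching run. The names `run_chain`, `guild_run`, `read_metric`, and `patience` are made up for this example; `read_metric` is a callable you would supply yourself to return the latest model's score.

```python
import subprocess


def guild_run(operation):
    """Run one Guild operation without prompting; raise if the run fails."""
    subprocess.run(["guild", "run", "-y", operation], check=True)


def run_chain(max_iterations, run_op=guild_run, read_metric=None, patience=1):
    """Bootstrap once, then alternate generate-data and train-model.

    read_metric is a hypothetical, user-supplied callable returning the
    latest model's score (higher is better). When it is given, the loop
    stops after `patience` consecutive rounds without improvement.
    Returns the number of generate/train rounds completed.
    """
    run_op("bootstrap-data")
    run_op("train-model")  # model_0, trained on the bootstrap data
    best = read_metric() if read_metric else None
    stale = 0
    rounds = 0
    for _ in range(max_iterations):
        run_op("generate-data")  # labels from the latest trained model
        run_op("train-model")    # assumed to pick up the newest generate-data run
        rounds += 1
        if read_metric is None:
            continue  # fixed number of rounds, no early stopping
        score = read_metric()
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return rounds
```

Calling `run_chain(3)` would run the bootstrap step, train model_0, and then do three generate/train rounds. Passing a `read_metric` callable turns on the early-stopping check.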

You can wrap that bash script in a Guild operation to run it and capture results. Use the exec operation attribute rather than main. See the Bash example for hints.

If you have any questions or run into problems, update here with details and I can help.
