Guild steps and pipeline - reuse same run

Is there a way to have a pipeline reuse the same run?

Here is an example section of my Guild file:

- model: segmentation
  operations:
    train_and_convert_pipeline:
      steps:
        - train_model batch_size=${batch_size} dryrun=${dryrun} num_epochs=${num_epochs} sample_ids=${sample_ids}
        - utils:convert_to_onnx
      flags:
        $include: train-flags

This will produce a run for the train_model op and a run for the utils:convert_to_onnx op. Is there a way to have utils:convert_to_onnx reuse the same run as train_model? The utils:convert_to_onnx op essentially just saves an additional file, so it would be nice not to have to keep track of an entire separate run for this.

I don't think I understand "reuse a run".

Guild runs use files as inputs. Runs generate files and those files can be used by subsequent runs. So any files generated by train_model can be made available to convert_to_onnx.

You can read about this in Dependencies.
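For example, a minimal sketch of such a dependency (the selected file name model.pt is hypothetical and depends on what train_model actually writes):

train_model: {}          # writes the trained model, e.g. model.pt
convert_to_onnx:         # uses the trained model as input
  requires:
    - op: train_model
      select: model.pt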

But you may be asking about something else so I'm not sure my answer here is helpful.

Let me try to rephrase. The above pipeline will create two Guild runs, one for the train_model op and one for utils:convert_to_onnx. The utils:convert_to_onnx op will use the train_model run as input, and that all works as expected.

I see the utils:convert_to_onnx as a 'patch' to the train_model run, i.e. in my mental model they should be the same run/experiment and not two separate runs.

Consider, for example, a situation where I want to share the ONNX model produced by utils:convert_to_onnx with some team members. In this case I have to share both runs in order for them to have all the info about what generated the ONNX model. Or at least that's how I currently understand it.

Did that clarify?

I understand now. Thank you for the clarification!

Guild is not really set up to do something like this. In Guild, a run, once completed, is informally considered read-only. Guild does not currently enforce this read-only state, but I think it should. The thinking is that, once a run is completed, it's set and should not later be changed. Future releases of Guild will likely formally support this via these mechanisms:

  • Set read-only file status for the run directory and run files
  • Generate a digest for the read-only run
  • Support checking a run against the digest to detect changes

These are all important considerations for reproducibility and auditability.

However, the patching scenario that you describe is quite common, and generation of a runnable artifact is a good example. Another example might be model compression, quantization, etc.

From Guild's point of view, these patch operations should be separate runs. This keeps the upstream runs immutable and separates any newly generated artifacts. If the downstream operation is meant to modify an upstream file, it should use a copy dependency and modify its own copy of the upstream file.

upstream: {}     # generates some file foo.txt
downstream:      # compresses foo.txt
  requires:
    - op: upstream
      select: foo.txt
      target-type: copy

In Guild 0.7.x the default target type is link. To copy, you need to explicitly use the copy target type as in the example above. This will change in 0.8 so that copy is the default. If you want to link, you'll need to use link. In that case, the link will be read-only, again using the rationale above.

Now, all this said, Guild does support a --restart option, which is specifically designed to re-run an operation within an existing run directory. This is really intended for use with terminated or error status, but it works just as well with completed status. The use case this addresses is the common one of restarting a run that stopped early or failed, e.g. to train more or to fix a bug without having to restart a run from scratch.
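For reference, a restart looks something like this (the run ID and flag value here are hypothetical):

guild run --restart 4c2b7f8e num_epochs=20    # re-runs the op inside the existing run directory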

For your case, I would first consider the Guild approach I describe above, where patches are really just additional runs. Think of this like a copy-on-write file system, where changes are implemented as additional transformations rather than in-place edits. Docker images, for example, work this way.

If you strongly prefer to edit the run files in place, you still need a second run. You can link to the files that you want to modify and then delete the patch run afterward. However, I think this is not ideal. The patch is a meaningful operation, which I think you should record. The second run formally captures the patch operation, including the source code used, flags, results, etc. If you delete this run, you lose that record.

Garrett, thank you for this answer. This works great.

There is one particular issue I am running into when using patching operations. In the example you provided:

upstream: {}     # generates some file foo.txt
downstream:      # compresses foo.txt
  requires:
    - op: upstream
      select: foo.txt
      target-type: copy

Imagine that upstream is a training operation with a bunch of Guild flags, metrics, etc. logged, all working well with the Guild visualization tools.

Now downstream patches the upstream run, but downstream has its own flags etc. When I use the Guild visualization tools on downstream, I am no longer able to see what flags, metrics, etc. the upstream operation produced.

This is an issue, since I am only tracking the downstream run and not the upstream run, and in the case of generating an ONNX model it is still nice to be able to visualize the training metrics produced by PyTorch (in my example).

I hope this makes sense; otherwise please let me know.

You've identified an important problem to solve. It's slightly related to Logging non-numeric output variables.

Guild supports a lot of metadata for runs, but there's nothing in place to propagate upstream run metadata downstream, especially for use in compare tables. The upstream flags want to appear in downstream runs, but they're not flags of those runs; they represent information about the downstream run. Nor are they simply logged summaries, text or numeric. Guild supports tags, labels, and comments, but these values aren't any of those either.

I can suggest some workarounds to help you through this, but unfortunately I don't have a solid proposal at this time. This problem must be solved though, so we'll get there. I've bumped this in priority as it's been lingering too long.

My recommendation, as a way forward in the short term, is to write the values-of-interest to the downstream flags attribute. This will cause them to appear in compare lists.

Here's an example you can work from.

This is a bit painful, but it should work. Unfortunately I can't think of a simpler workaround for what you need.
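For illustration only (this is not the example linked above), here is a rough sketch of the idea in Python. It assumes Guild's run-directory layout, where run flags are stored as YAML under .guild/attrs/flags, and it assumes you pass the upstream run directory to the script yourself (e.g. via a flag); verify both assumptions against your setup.

import os
import yaml

def merge_upstream_flags(upstream_run_dir, prefix="train_"):
    # Current run directory, set by Guild via the RUN_DIR environment variable
    run_dir = os.environ["RUN_DIR"]
    upstream_flags_path = os.path.join(upstream_run_dir, ".guild", "attrs", "flags")
    current_flags_path = os.path.join(run_dir, ".guild", "attrs", "flags")

    with open(upstream_flags_path) as f:
        upstream_flags = yaml.safe_load(f) or {}
    with open(current_flags_path) as f:
        current_flags = yaml.safe_load(f) or {}

    # Prefix upstream flag names to avoid colliding with downstream flags
    merged = dict(current_flags)
    merged.update({prefix + name: val for name, val in upstream_flags.items()})

    with open(current_flags_path, "w") as f:
        yaml.safe_dump(merged, f)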


Just tried it out - it works great.

Last question: what is the best way to determine whether a Python script was executed through Guild? Is there support for this in the Python API?

You can check for any number of environment variables:

  • RUN_ID: run ID of the current run
  • RUN_DIR: run directory of the current run
  • GUILD_OP: name of the current operation, or path to the running script
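For example, a minimal check might look like this:

import os

def running_under_guild():
    # Guild sets RUN_ID (among others) when it executes a script as part of a run
    return os.getenv("RUN_ID") is not None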

If you want to use a Guild-specific API (I recommend avoiding this whenever possible, as it needlessly locks your code into yet another API dependency), you can use something like this:

from guild import _api as gapi

run = gapi.current_run()  # raises gapi.NoCurrentRun if no current run

The _api naming convention signals that this module is not officially supported. But you can rely on this interface. At some point it will just be from guild import gapi.


My case is similar to this one. Basically I have two operations, train and evaluate. With the current approach you get two runs, one containing the training flags and files, and the other the evaluation flags and files. So checking which training run produced a given set of evaluation scores is painful, considering the runs are independent. From another perspective, these runs are one experiment and its parameters should be in one place.

This is a good point. We've had a planned feature to support joins across runs to address this. I'll check the issues list and make sure this gets the proper attention.

I wish there was a straightforward workaround for this. Without proper join support (i.e. Guild automatically connecting the upstream run flags/attrs/scalars to the downstream run), you'd need to log this yourself in the evaluation run.

To do this, you can read the train flags and merge them into the eval flags. I'm embarrassed to suggest this as it's a pain for something so reasonable. But if it gets you past the issue, perhaps it's a worthwhile suggestion.

Here's a gist that does this.

This just gets you flags, however. If you want to include training scalars in your results, you'll need to read the training run's scalar values. If that's something you'd like to consider, I can get you some code that works.
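As a rough sketch of what reading the training scalars could look like (not the code mentioned above): Guild typically stores scalars as TensorBoard event files under the run directory, so TensorBoard's EventAccumulator can read them back. The recursive glob pattern is an assumption about where those event files live in your runs.

import glob
import os

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def read_run_scalars(run_dir):
    # Collect the last logged value for each scalar tag found under run_dir
    scalars = {}
    pattern = os.path.join(run_dir, "**", "*.tfevents.*")
    for events_path in glob.glob(pattern, recursive=True):
        acc = EventAccumulator(events_path)
        acc.Reload()
        for tag in acc.Tags().get("scalars", []):
            scalars[tag] = acc.Scalars(tag)[-1].value
    return scalars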

As a note on a possible alternative to joins (this will get complex as well, as the set of possible joined values can become very large, especially in pipelines), we might consider supporting automatic merging of flags and scalars based on a setting in the dependency. E.g.

train: {}
eval:
  requires:
    - operation: train
      merge-flags: yes
      merge-scalars: yes

The merge attributes might additionally support a string value, which is the name pattern/template to use when merging (e.g. merge-flags: train_{}).

Sorry again, I realize what I'm proposing to you above is a bit ridiculous for you to do yourself. Guild should handle this more elegantly. We'll get this cleaned up in the next release.