This proposal seeks to address Guild’s requirement that an upstream run be available for an operation dependency. It proposes new operation attributes that control how Guild responds when it can’t find a suitable run for an operation dependency.
fail-if-unresolved may be set to false to prevent run failure, and warn-if-unresolved may be set to false to silence the warning logged when a dependency can’t be resolved.
This proposal is awaiting feedback
A run fails when Guild cannot resolve its dependencies. In most situations, this is desirable — the run should not proceed if a required resource is not available.
However, there are cases where a user wants resources linked when they exist but wants the run to proceed without them when they don’t. Such runs are designed to work whether or not the resources are available.
Consider a training run that makes use of a previous training run (e.g. to learn from the previous result or to resume training with saved models). In this case, the operation might be defined like this:
train:
  requires:
    - operation: train
      select:
        - saved_model.*
If a previous training run doesn’t exist, the operation will fail. This prevents users from using this sort of self-referencing operation.
Guild will support two new attributes for a resource source:
If fail-if-unresolved is true (default), Guild generates an error when the source cannot be resolved. If false, Guild optionally logs a warning, based on warn-if-unresolved, and proceeds with the run.
If warn-if-unresolved is true (default), Guild logs a warning when the resource source cannot be resolved. If false, Guild proceeds with the run without a warning.
To implement a training run that continues when an upstream required operation cannot be found (see the example above), an operation can be configured as:
train:
  requires:
    - operation: train
      fail-if-unresolved: false
      select:
        - saved_model.*
In this case, Guild attempts to resolve the dependency by finding a non-error run for train. If it cannot find such a run, it logs a message indicating that it can’t resolve the dependency but continues nonetheless.
To suppress the warning message, the user could specify warn-if-unresolved: false for the resource source.
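As a sketch of the fully silent variant, the train example could set both attributes on the same resource source. This assumes only the attribute names and placement proposed in this document; it is illustrative rather than a tested configuration.

```yaml
# Sketch only: uses the attributes proposed in this document.
train:
  requires:
    - operation: train
      # Continue the run when no suitable previous train run is found
      fail-if-unresolved: false
      # Do not log a warning about the unresolved source
      warn-if-unresolved: false
      select:
        - saved_model.*
```

With this configuration, a first train run would be expected to start without a prior run and without a warning, while later runs would link the saved models from a previous run when one is found.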
One Guild user proposed a workaround for this limitation.
While this is an ingenious workaround, it’s unintuitive and complicated compared to the optional attribute. Doing nothing here is not an option.
fail-if-unresolved is arguably a bit pedantic/wonky. A simpler optional flag would be sufficient to address the target problem.
The example above would look like this:
train:
  requires:
    - operation: train
      optional: true
      select:
        - saved_model.*
fail-if-xxx is already a convention for options that specify whether or not Guild should continue with the run (e.g.
optional does not control whether or not Guild logs a warning when the source can’t be resolved.
optional is paradoxical for a requirement.