A dependency is a file or other resource required by an operation. Dependencies are defined for an operation in a Guild file using the requires attribute:

train:
  requires:
    - file: data.csv
Important: Dependencies play a key role in Guild experiments. Guild run directories are initially empty. Any files required by an operation must be defined as dependencies.
Dependencies are defined using resources, which in turn consist of one or more sources. When Guild runs an operation, it resolves all required resource sources. If a source cannot be resolved, Guild stops the run with an error.
When Guild runs train above, it resolves the required file data.csv as follows:
- Guild looks for data.csv relative to the Guild file location.
- If Guild finds data.csv, it creates a link to or a copy of data.csv in the run directory.
- If Guild does not find data.csv, it stops the run with an error message.
Use dependencies to ensure that an operation has what it needs to run.
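The resolution steps above can be sketched in a few lines of Python. This is an illustration of the described behavior, not Guild's implementation; the function name resolve_file and its arguments are hypothetical, and the fallback from link to copy is an assumption made so the sketch runs on systems without symlink support.

```python
import os
import shutil


def resolve_file(source, project_dir, run_dir):
    """Make a project file available in run_dir, or raise if it is missing."""
    src = os.path.join(project_dir, source)
    if not os.path.exists(src):
        # An unresolved source stops the run with an error.
        raise FileNotFoundError(f"cannot resolve required file: {source}")
    dest = os.path.join(run_dir, source)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    try:
        # Default behavior described above: link to the resolved file.
        os.symlink(os.path.abspath(src), dest)
    except OSError:
        # Assumption for this sketch: copy when links are unavailable.
        shutil.copy2(src, dest)
    return dest
```

Calling resolve_file("data.csv", project_dir, run_dir) either places data.csv in the run directory or fails before the operation starts.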
Inline vs Named Resources
The requires operation attribute is a list of either inline or named resources.
Inline resources are defined directly in the requires attribute. The example above shows an inline resource consisting of a single file source.
Named resources are defined in a model resources attribute and are referenced by name, as strings, in requires.
The following shows the dependency above as a named resource:

- operations:
    train:
      requires:
        - data
  resources:
    data:
      - file: data.csv
Use a named resource to:
- Share a resource definition across operations
- Simplify a requirements specification to include only names — this can clarify operation dependencies
Guild supports a number of resource types, or resource source types, which are described below. Refer to Guild File Reference for a specification of each type.
To make a project file or directory available for a run, define a file source:

train:
  requires:
    - file: data.csv
Guild either links to the file or creates a copy based on the source target-type attribute. By default Guild creates links to resolved files. To create a copy, set target-type to copy:

train:
  requires:
    - file: data.csv
      target-type: copy
Use copy to ensure that changes to a file are not applied to current runs by way of links. This is an important consideration when auditing runs. Note however that copying duplicates a file or directory for each run.
copy will become the default target type in a future version of Guild.
If the specified file is an archive (i.e. has a known archive extension such as .tar), Guild unpacks the file as a part of resolving it. You can disable this behavior with a source attribute; see Guild File Reference.
Use target-path to resolve sources within subdirectories. For example, to copy data.csv above into a data subdirectory, use:

train:
  requires:
    - file: data.csv
      target-type: copy
      target-path: data
For additional attributes used to configure sources, see Guild File Reference.
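To see why a copy protects a run from later edits to the project file, consider the following minimal sketch. It assumes a hypothetical helper, resolve_copy, that mimics the combined effect of target-type: copy and target-path; it is not Guild's code.

```python
import os
import shutil


def resolve_copy(src, run_dir, target_path=""):
    """Copy src into run_dir, optionally under a target_path subdirectory."""
    dest_dir = os.path.join(run_dir, target_path)
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src))
    # A copy is independent of the original: later edits to src
    # do not change what the run recorded.
    shutil.copy2(src, dest)
    return dest
```

With target_path="data", the file lands at data/data.csv inside the run directory. The cost is one duplicate of the file per run.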
To resolve a file located on a network, define a url source:

train:
  requires:
    - url: http://my.org/data.tar.gz
The same resolution rules for project files (see above) apply to network files.
Guild downloads network files and saves them to a resource cache located in Guild home. By default, Guild creates links to these cached resources. To ensure that a run has a copy of the resolved sources and does not depend on these cached files, use copy for the source target-type:

train:
  requires:
    - url: http://my.org/data.tar.gz
      target-type: copy
To ensure that a downloaded source is not corrupt, use the sha256 attribute to define a SHA-256 digest. Guild checks the digest of downloaded files and stops with an error message if it doesn’t match the specified value. Use guild download to pre-fetch a network file and calculate its SHA-256 digest for use with this attribute.
Refer to Guild File Reference for other attributes used to configure network file sources.
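The digest check described for sha256 can be reproduced with Python's standard hashlib module, which is handy for confirming a value before adding it to a Guild file. The helper names sha256_of and verify_sha256 are hypothetical; this is a sketch of the idea rather than Guild's own check.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected):
    """Raise ValueError if the file digest differs from the expected value."""
    actual = sha256_of(path)
    if actual != expected.lower():
        raise ValueError(f"unexpected sha256 for {path}: {actual}")
```

Reading in chunks keeps memory use flat for large downloads.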
To use files generated by other operations for a run, use an operation source.
The following defines three operations that constitute a pipeline for preparing data, training, and testing:
prepare-data:
  requires:
    - file: data.csv

train:
  requires:
    - operation: prepare-data

test:
  requires:
    - operation: prepare-data
    - operation: train
Guild resolves an operation source by first selecting a run that matches the specified operation name. By default Guild selects the latest non-error run. A user can optionally specify the run ID using flag assignment syntax.
If Guild cannot find a suitable run, it fails with an error message.
When Guild finds a suitable run, it creates links to the files in that run directory. This makes the output of the required run available to the dependent run.
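The run selection rule above can be pictured as a filter followed by a sort. In this sketch the run records are plain dictionaries with hypothetical keys (id, operation, status, started); Guild's actual run model is not shown in this document, so treat the field names and the prefix match on run IDs as assumptions.

```python
def select_run(runs, operation, run_id=None):
    """Pick the run used to resolve an operation source."""
    candidates = [r for r in runs if r["operation"] == operation]
    if run_id is not None:
        # An explicitly specified run ID overrides the default selection.
        candidates = [r for r in candidates if r["id"].startswith(run_id)]
    else:
        # Default: consider only runs that did not end in error.
        candidates = [r for r in candidates if r["status"] != "error"]
    if not candidates:
        raise LookupError(f"no suitable run for operation: {operation}")
    # Latest run wins.
    return max(candidates, key=lambda r: r["started"])
```

A failed training run is therefore skipped in favor of the most recent run that did not fail, unless the user names a run explicitly.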
To select a subset of run files, use the select source attribute. For example, to select only files ending with .hdf5 (a common extension for serialized Keras files):

test:
  requires:
    - operation: train
      select: '.+\.hdf5'
Note: Values for select are regular expressions and not file system wildcards. This will change in a future version of Guild.
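The difference between a regular expression and a file system wildcard is easy to check with Python's re module. The sketch below assumes patterns are matched from the start of each path; how Guild applies the pattern internally is not specified here, and select_files is a hypothetical helper.

```python
import re


def select_files(paths, pattern):
    """Return the paths matching a regular expression from the start."""
    regex = re.compile(pattern)
    return [p for p in paths if regex.match(p)]


def is_valid_regex(pattern):
    """Report whether a pattern compiles as a regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True
```

For example, select_files(["model.hdf5", "events.log"], r".+\.hdf5") keeps only model.hdf5, while the wildcard-style pattern *.hdf5 is not a valid regular expression at all, so is_valid_regex("*.hdf5") is False.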
Other source attributes may be used to further configure operation sources.
To use a configuration file that contains the current run values, use a config source:

train:
  flags:
    learning-rate: 0.1
    batch-size: 100
  requires:
    - config: config.yml
When Guild resolves this source, it looks for a project file named config.yml. It applies the current flag values to the configuration file and writes it to the run directory.
Guild supports two configuration file formats:
Guild uses the extension of the specified file to determine the format.
The file config.yml referenced above might look like this:

learning-rate: 0.1
batch-size: 100
dropout: 0.2
This file defines three settings, two of which are also defined for the train operation above. When a user starts train, Guild applies the specified flag values to the configuration file.
The following command sets learning-rate. The value for batch-size, defined in the operation above, is unchanged.

guild run train learning-rate=0.2
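Guild writes the resolved config.yml to the run directory. The dropout setting has no corresponding flag, so it keeps its original value:

learning-rate: 0.2
batch-size: 100
dropout: 0.2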
Note: Guild uses the flag name when writing values to configuration files. To write to nested values, use dots in the flag names to denote nesting levels. A future version of Guild will support configuration files using the flags interface, which will provide more flexibility and features to support this case.
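The way flag values overlay a configuration file, including dotted names for nested values, can be modeled with plain dictionaries. This is a sketch of the described behavior only; apply_flags and the nested flag name optimizer.lr used in the usage note are hypothetical.

```python
import copy


def apply_flags(config, flags):
    """Return a copy of config with flag values written by name."""
    resolved = copy.deepcopy(config)
    for name, value in flags.items():
        node = resolved
        # Dots in a flag name denote nesting levels.
        *parents, leaf = name.split(".")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return resolved
```

Settings without a matching flag are left untouched, and a flag such as optimizer.lr would be written to the lr key under optimizer.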
To test if a software module is available before starting a run, use a module source:

train:
  requires:
    - module: pandas
    - module: keras
Guild attempts to load the module before starting the run. If it cannot load the module, it exits with an error message.
Use the help attribute to provide a user-friendly message when the check fails.

train:
  requires:
    - module: matplotlib
      help: operation requires matplotlib - install it using 'pip install matplotlib'
Important: Guild does not install modules listed in requires; it only verifies that they are available. Ensure that required modules are installed in the environment before running the operation.
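A comparable availability check can be written with the standard importlib module, which is useful for testing an environment before starting a run. The function check_modules and its (name, help message) pairs are hypothetical and only approximate the behavior described above.

```python
import importlib


def check_modules(requirements):
    """Verify that each (module_name, help_message) pair can be imported."""
    for name, help_message in requirements:
        try:
            importlib.import_module(name)
        except ImportError:
            # Nothing is installed here; the check only reports the problem.
            message = help_message or f"required module not available: {name}"
            raise RuntimeError(message) from None
```

As with the module source, a failed check reports the problem and leaves installation to the user.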
Share Resources Across Models
Use config objects and inheritance to share resource definitions across models.

- config: shared-resources
  resources:
    data:
      - file: data.csv

- model: a
  extends: shared-resources
  operations:
    train:
      main: train_a
      requires: data

- model: b
  extends: shared-resources
  operations:
    train:
      main: train_b
      requires: data