Overview
A dependency is a file or other resource required by an operation. Dependencies are defined for an operation in a Guild file using the requires
attribute.
train:
requires:
- file: data.csv
Important Dependencies play a key role in Guild experiments. Guild run directories are initially empty. Any files required by an operation must be defined as dependencies.
Dependencies are defined using resources, which in turn consist of one or more sources. When Guild runs an operation, it resolves all required resource sources. If a source cannot be resolved, Guild stops the run with an error.
When Guild runs train
above it resolves the required file data.csv
as follows:
- Guild looks for
data.csv
relative to the Guild file location. - If Guild finds
data.csv
it creates a link to or a copy ofdata.csv
in the run directory. - If Guild does not find
data.csv
it stops the run with an error message.
Use dependencies to ensure that an operation has what it needs to run.
Inline vs Named Resources
The requires
operation resource is a list of either inline or named resources.
Inline resources are defined directly in the requires
attribute. The example above shows an inline resource consisting of a single file
source.
Named resoures are defined in a model resources
attribute. Named resources are specified as strings in requires
with operation name.
This following shows the dependency above as a named resource:
- operations:
train:
requires:
- data
resources:
data:
- file: data.csv
Use a named resource to:
- Share a resource definition across operations
- Simplify a requirements specification to include only names β this can clarify operation dependencies
Resource Types
Guild supports a number of resource types, or resource source types, which are described below. Refer to Guild File Reference for a specification of each type.
Project Files
To make a project file or directly available for a run, define a file
source.
train:
requires:
- file: data.csv
Guild either links to the file or creates a copy based on the source target-type
attribute. By default Guild creates links to resolved files. To create a copy, specify copy
for target-type
:
train:
requires:
- file: data.csv
target-type: copy
Note Use
copy
to ensure that changes to a file are not applied to current runs by way of links. This is an important consideration when auditing runs. Note however that copying duplicates a file or directory for each run.
copy
will become the default target type in a future version of Guild.
If the specified file is an archive β i.e. has a known archive syntax such as .zip
, .tar
, etc. β Guild unpacks the file as a part of resolving it. You can disable this behavior by setting unpack
to no
.
Use target-path
to resolve sources within subdirectories. For example, to copy data.csv
above into a data
subdirectory, use:
train:
requires:
- file: data.csv
target-type: copy
target-path: data
For additional attributes used to configure sources, see Guild File Reference.
Network Files
To resolve a file located on a network, define url
source.
train:
requires:
- url: http://my.org/data.tar.gz
The same resolution rules for project files (see above) apply to network files.
Guild downloads network files and saves them to a resource cache located in Guild home. By default, Guild creates links to these cached resources. To ensure that a run has a copy of the resolved sources and does not depend on these cached files, use copy
for the source target-type
.
train:
requires:
- url: http://my.org/data.tar.gz
target-type: copy
To ensure that a downloaded source is not corrupt, use the sha256
attribute to define a SHA-256 digest. Guild checks the digest of downloaded files and stops with an error message if it doesnβt match the specified value. Use guid download
to pre-fetch a network file and calculate its SHA-256 digest for use with this attribute.
Refer to see Guild File Reference for other attributes used to configure network file sources.
Run Files
To use files generated by other operations for a run, use an operation
source.
The following defines three operations that constitute a pipeline for preparing data, training, and testing:
prepare-data:
requires:
- file: data.csv
train:
requires:
- operation: prepare-data
test:
requires:
- operation: prepare-data
- operation: train
Guild resolves an operation source by first selecting a run that matches the specified operation name. By default Guild selects the latest non-error run. A user can optionally specify the run ID using flag assignment syntax.
If Guild cannot find a suitable run, it fails with an error message.
When Guild finds a suitable run, it creates links to the files in that run directory. This makes the output of the required run available to the dependent run.
To select a subset of run files, use the select
source attribute. For example, to select only files ending with .hdf5
(a common extension for serialized Keras files):
test:
requires:
- operation: train
select: '.+\.hd5f'
Note Values for
select
are regular expressions and not file system wildcards. This will change in a future version of Guild.
Other source attributes may be used to further configure operation sources.
Configuration Files
To use a configuration file that contains the current run values, use a config
source.
train:
flags:
learning-rate: 0.1
batch-size: 100
requires:
- config: config.yml
When Guild resolves this source, it looks for a project file named config.yml
. It applies the current flag values to the configuration file and writes it to the run directory.
Guild supports two configuration file formats:
Guild uses the extension of the specified file to determine the format.
The file config.yml
referenced above might look like this:
learning-rate: 0.1
batch-size: 100
dropout: 0.2
This file defines three settings, two of which are also defined for the train
operation above. When a user start train
, Guild applies the specified flag values to config.yml
.
The following command sets learning-rate
. The value for batch-size
, defined in the operation above, is unchanged.
guild run train learning-rate=0.2
Guild writes the resolved config.yml
as:
learning-rate: 0.2
batch-size: 100
dropout: 0.3
Note Guild uses the flag name when writing values to configuration files. To write to nested values, use dots in the flag names to denote nesting levels. A future version of Guild will support configuration files using the flags interface, which will provide more flexibility and features to support this case.
Software Modules
To test if a software module is available before starting a run, use a module
source.
train:
requires:
- module: pandas
- module: keras
Guild attempts to load the module before starting the run. If it cannot load the module, it exits with an error message.
Use the help
attribute to provide a user friendly message when the check fails.
train:
requires:
- module: matplotlib
help: operation requires matplotlib - install it using 'pip install matplotlib'
Important Guild does not install modules defined for
requires
β it verifies that the modules are available. Ensure that required modules are installed in the environment before running the operation.
Share Resources Across Models
Use config
objects and inheritance to share resource definitions across models.
Example:
- config: shared-resources
resources:
data:
- file: data.csv
- model: a
extends: shared-resources
operations:
train:
main: train_a
requires: data
- model: b
extends: shared-resources
operations:
train:
main: train_b
requires: data