Summary
This is a proposal for a feature that lets users run operations that use one or more Guild runs as inputs. Consider the case where a user wants to analyze a set of runs to calculate average performance and to select the best performing model. The summary operation needs to know what runs to analyze and have easy access to those runs to perform its work.
This proposal is awaiting feedback.
Problem
It’s often useful to perform analysis on Guild-generated runs. There are a number of common use cases:
- Generate a report on the models generated from a particular data set
- Select a production candidate from many possible models
- Generate a new model from a set of models or a set of datasets, which are represented by one or more Guild runs
It’s possible for users to manually scan runs using either guild.ipy or the yet-to-be-released API guild._api. However, this requires tedious and potentially error-prone programming.
Guild should make this process as easy as possible.
Proposed Approach
Guild should introduce a new operation type: “summary operation”. A summary operation is a standard Guild operation that requires a set of runs. This requirement is expressed as a Guild dependency.
The operation dependency type should be extended to support multiple runs by way of a multi-run source attribute.
op: guild.pass
summary:
  requires:
    - operation: op
      multi-run: yes
When an operation dependency is multi-run, Guild resolves the dependency by selecting and linking to each matching run. Links are created in the summary operation run directory by default, or under target-path as specified in the dependency source. Links are named using the full run ID and link to the corresponding directory under $GUILD_HOME/runs.
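The linking step described above can be sketched in Python. This is only an illustration of the intended layout, not Guild's implementation: the function name link_runs and its parameters are hypothetical, and runs_home stands in for $GUILD_HOME/runs.

```python
import os
import tempfile


def link_runs(run_ids, runs_home, dest_dir):
    """Create one symlink per run in dest_dir, named by the full run ID.

    Hypothetical helper: runs_home stands in for $GUILD_HOME/runs and
    dest_dir for the summary run directory (or its target-path).
    """
    os.makedirs(dest_dir, exist_ok=True)
    links = []
    for run_id in run_ids:
        target = os.path.join(runs_home, run_id)
        link = os.path.join(dest_dir, run_id)
        # Each link points at the top-level run directory itself.
        os.symlink(target, link, target_is_directory=True)
        links.append(link)
    return links


if __name__ == "__main__":
    home = tempfile.mkdtemp()
    runs = os.path.join(home, "runs")
    os.makedirs(os.path.join(runs, "abc123"))
    print(link_runs(["abc123"], runs, os.path.join(home, "summary-run")))
```

Note that the run directories are linked, not copied, so the summary run reads the upstream files in place.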
guild-runs.json
For multi-run dependencies, Guild generates a guild-runs.json file in the same directory as the linked runs. This file contains likely-useful metadata for each linked run.
// guild-runs.json - located alongside the linked runs in summary op run dir
[
  {
    "id": "xxxyyy",
    "dir": "./xxxyyy",
    "status": "completed",
    "flags": {...},
    "scalars": {...},
    ...
  },
  ...
]
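A summary operation could read this file to compute an average and pick the best run, as in the use case from the Summary section. The sketch below is only an illustration under stated assumptions: the file format is still a proposal, "scalars" is assumed to be a flat mapping of scalar name to number, the scalar name "loss" is hypothetical, and lower values are treated as better.

```python
import json


def load_runs(path="guild-runs.json"):
    """Load the list of linked-run records from the proposed file."""
    with open(path) as f:
        return json.load(f)


def summarize(runs, scalar="loss"):
    """Return (mean, best_run) for `scalar` over completed runs.

    Assumes each run's "scalars" is a flat {name: number} mapping and
    that a lower value is better; both are assumptions of this sketch.
    """
    scored = [
        (run["scalars"][scalar], run)
        for run in runs
        if run.get("status") == "completed" and scalar in run.get("scalars", {})
    ]
    if not scored:
        raise ValueError(f"no completed runs with scalar {scalar!r}")
    mean = sum(value for value, _ in scored) / len(scored)
    best = min(scored, key=lambda pair: pair[0])[1]
    return mean, best
```

In a summary operation, usage might look like `mean, best = summarize(load_runs())`, after which `best["dir"]` gives the linked directory of the selected run.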
Run selection
As a part of this proposal, the operation dependency type will be extended to support a select attribute. select is a query-like expression Guild uses to resolve the required runs. This is an extension of the operation attribute value currently used, which only supports operation name selection.
The select attribute can be used to test a run using criteria for run attributes, flag values, and scalars. The select specification will support boolean expressions.
summary:
  requires:
    - operation: op
      multi-run: yes
      select: label contains 'red' and completed
NOTE: The select feature will also be made available in the guild select command.
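The proposal does not define the select grammar, so the following does not parse select expressions. It only sketches the semantics of a boolean selection such as "label contains 'red' and completed" as composable Python predicates. All helper names are hypothetical, and run records are assumed to be dictionaries with "label" and "status" keys.

```python
from typing import Any, Callable, Dict

Run = Dict[str, Any]
Predicate = Callable[[Run], bool]


def label_contains(text: str) -> Predicate:
    """True when the run's label (assumed key) contains `text`."""
    return lambda run: text in run.get("label", "")


def has_status(status: str) -> Predicate:
    """True when the run's status equals `status`."""
    return lambda run: run.get("status") == status


def all_of(*predicates: Predicate) -> Predicate:
    """Boolean AND over predicates."""
    return lambda run: all(p(run) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """Boolean OR over predicates."""
    return lambda run: any(p(run) for p in predicates)


# Rough equivalent of: label contains 'red' and completed
select = all_of(label_contains("red"), has_status("completed"))
```

A real implementation would compile the select string into a predicate of this kind and apply it to each candidate run.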
Command line selection
A user may specify a select spec for a multi-run dependency by prefixing the dependency name with where in a flag-like assignment:
guild run summary op="where label contains 'red' and completed"
Run IDs may be specified using comma- or space-delimited lists of full or partial run IDs.
guild run summary op="abcd1234 defa5678 bcde9012"
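The two command line forms above could be told apart roughly as follows. This is a sketch under assumptions, not Guild's parser: the function names are hypothetical, a value starting with "where " is treated as a select spec, anything else as a run ID list, and a partial ID is assumed to match by prefix and to require exactly one match.

```python
import re


def parse_dep_value(value):
    """Classify a dependency value as a select spec or a list of run IDs."""
    value = value.strip()
    if value.startswith("where "):
        return ("select", value[len("where "):].strip())
    # Accept comma-delimited, space-delimited, or mixed lists.
    ids = [part for part in re.split(r"[,\s]+", value) if part]
    return ("ids", ids)


def resolve_ids(prefixes, known_ids):
    """Expand full or partial run IDs to full IDs (prefix match assumed)."""
    resolved = []
    for prefix in prefixes:
        matches = [run_id for run_id in known_ids if run_id.startswith(prefix)]
        if len(matches) != 1:
            raise ValueError(
                f"run ID {prefix!r} matched {len(matches)} runs, expected 1"
            )
        resolved.append(matches[0])
    return resolved
```

Rejecting ambiguous or unmatched partial IDs before the operation starts fits the preview step, where the user confirms the exact set of runs.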
Summary op preview
Guild will fully resolve the runs to link before starting the operation and show the selected runs in a preview.
You are about to run summary
The following runs are selected:
[63d8c402] op 2022-05-10 09:51:19 completed
[52a07a44] op 2022-05-10 09:51:18 completed
[c8d00fb7] op 2022-05-10 09:51:17 completed
[65895e44] op 2022-05-10 09:51:16 completed
[ca4f560e] op 2022-05-10 09:51:15 completed
[b0567025] op 2022-05-10 09:50:49 completed
Continue? (Y/n)
Alternative Approaches
Deprecate operation in favor of run and multi-run dependencies
The current operation dependency is arguably misspelled. Strictly speaking, a downstream operation requires a run. We might consider renaming this dependency type accordingly.
upstream: guild.pass
downstream:
  requires:
    - run: upstream
The run attribute here would be the select expression. In this case, the spec is shorthand for:
downstream:
  requires:
    - run: op = upstream
In the case of multi-run, the configuration would be:
downstream:
  requires:
    - multi-run: upstream
This distinction at the top level is defensible, considering the differences in how run and multi-run sources are resolved. For a run, a single run is selected and its contents (i.e. the files inside the run directory) are resolved within the downstream run directory. For multi-run, the top-level run directories themselves are resolved as links within the downstream (summary op) run directory. This difference is arguably better highlighted by making run and multi-run dependencies distinct at the top level (as opposed to when an attribute is set to true).
The operation dependency type would be forever supported but officially deprecated in favor of run and multi-run.
This spelling has some advantages over the above proposal:
- The term “run dependency” is more accurate than “operation dependency”. “Operation” is even inaccurate when a select spec omits an operation name, should that be supported (it technically could be, e.g. where the select looks for runs containing certain files, independent of the operation name)
- Clarifies the operation dependency as a single run dependency
- Makes the distinction between run and multi-run clearer
- Removes the need for a separate select attribute
Cons to this approach:
- Cognitive shift/lift needed by users might not be justified by the benefits
- Updated documentation
- Standard cost of maintaining a deprecated setting (docs, implementation, and tests)