Usage
guild compare [OPTIONS] [RUN...]
Compare run results.
Guild Compare is a console-based application that displays a spreadsheet of runs with their current accuracy and loss. The application continues to run until you exit it by pressing q (for quit).
Guild Compare supports a number of commands. Commands are activated by pressing a letter. To view the list of commands, press ?.
Guild Compare does not automatically update to display the latest available data. If you want to update the list of runs and their status, press r (for refresh).
You may alternatively use the --csv option to write a CSV file containing the compare data. To print the CSV contents to standard output, use ‘-’ for the file path.
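The CSV output can be consumed from a script. The following is a minimal sketch, assuming `guild` is on the PATH; the helper names are hypothetical and the sample text is made up for illustration, so real column headers may differ.

```python
import csv
import io
import subprocess


def parse_compare_csv(text):
    """Parse CSV text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def read_compare_csv():
    """Run `guild compare --csv -` and parse what it prints to standard output."""
    result = subprocess.run(
        ["guild", "compare", "--csv", "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compare_csv(result.stdout)


# Made-up sample standing in for real output; actual headers may differ.
sample = "run,operation,loss\nabc123,train,0.25\n"
rows = parse_compare_csv(sample)
```

All values arrive as strings, so convert numeric columns (for example with `float`) before comparing them.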
Compare Columns
Guild Compare shows columns for each run based on the columns defined for each run operation. Additional columns may be specified using the --columns option, which must be a comma separated list of column specs. See below for column spec details.
If multiple columns have the same name, they are merged into a single column. Cell values are merged by taking the first non-null value in the list of cells with the common name from left-to-right.
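The merge rule can be illustrated with a short sketch. This is a simplified model of the behavior described above, not Guild's implementation; `merge_row` and the sample values are hypothetical.

```python
def merge_row(headers, row):
    """Merge cells that share a header, keeping the first non-null value.

    `headers` and `row` are parallel sequences ordered left to right.
    None stands for a null cell; 0 and "" count as real values.
    """
    merged = {}
    for name, value in zip(headers, row):
        # Only fill the slot if no earlier cell with this name had a value.
        if merged.get(name) is None:
            merged[name] = value
    return merged


# Two columns named "loss": the left one is null, so the right one is used.
example = merge_row(["loss", "acc", "loss"], [None, 0.9, 0.3])
```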
By default, columns always contain run ID, model, operation, started, time, label, status, and the set of columns defined for each displayed operation. You can skip the core columns with --skip-core and skip the operation columns with --skip-op-cols.
Column Specs
Each column specified in COLUMNS
must be a valid column spec. A column spec is the name of a run flag or scalar key. Flag names must be preceded by an equals sign =
to differentiate them from scalar keys.
For example, to include the flag epochs
as a column, use --columns =epochs
.
If a scalar is specified, it may be preceded by a qualifier of min, max, first, last, avg, total, or count to indicate the type of scalar value. For example, to include the highest logged value for accuracy, use --columns "max accuracy".
By default last is assumed, so that the last logged value for the specified scalar is used.
A scalar spec may additionally contain the key word step to indicate that the step associated with the scalar is used. For example, to include the step of the last accuracy value, use --columns "accuracy step". Step may be used with scalar qualifiers. For example, to include the value and associated step of the lowest loss, use --columns "min loss, min loss step".
Column specs may contain an alternative column heading using the keyword as in the format COL as HEADING. Headings that contain spaces must be quoted.
For example, to include the scalar val_loss with name validation loss, use --columns "val_loss as 'validation loss'".
You may include run attributes as column specs by preceding the run attribute name with a period (.). For example, to include the stopped attribute, use --columns .stopped. This is useful when using --skip-core.
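To make the column spec rules concrete, the sketch below classifies a spec as a flag, run attribute, or scalar. It is a simplified model written for illustration, not Guild's parser; the function names and the returned dictionary layout are assumptions, and headings that contain commas are not handled.

```python
import shlex

QUALIFIERS = {"min", "max", "first", "last", "avg", "total", "count"}


def parse_col_spec(spec):
    """Classify one column spec (simplified; not Guild's own parser)."""
    tokens = shlex.split(spec)  # honors quoted headings such as 'validation loss'
    heading = None
    if len(tokens) >= 3 and tokens[-2] == "as":
        heading = tokens[-1]
        tokens = tokens[:-2]
    if len(tokens) == 1 and tokens[0].startswith("="):
        return {"kind": "flag", "name": tokens[0][1:], "heading": heading}
    if len(tokens) == 1 and tokens[0].startswith("."):
        return {"kind": "attr", "name": tokens[0][1:], "heading": heading}
    qualifier = "last"  # default when no qualifier is given
    if len(tokens) > 1 and tokens[0] in QUALIFIERS:
        qualifier = tokens[0]
        tokens = tokens[1:]
    step = False
    if len(tokens) > 1 and tokens[-1] == "step":
        step = True
        tokens = tokens[:-1]
    if len(tokens) != 1:
        raise ValueError(f"unsupported column spec: {spec!r}")
    return {"kind": "scalar", "name": tokens[0], "qualifier": qualifier,
            "step": step, "heading": heading}


def parse_columns(columns):
    """Split a comma separated COLUMNS value and parse each spec."""
    return [parse_col_spec(part.strip()) for part in columns.split(",")]


specs = parse_columns("min loss, min loss step")
```

Here `specs` holds two scalar entries for `loss` with the `min` qualifier; only the second has `step` set.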
Sort Runs
Use --min and --max to sort results by a particular column. --min sorts in ascending order and --max sorts in descending order.
When specifying COLUMN, use the column name as displayed in the table output. If the column name contains spaces, quote the value.
By default, runs are sorted by start time in descending order, i.e. the most recent runs are listed first.
Limit Runs
To limit the results to the top N runs, use --top.
Specify Runs
You may use one or more RUN arguments to indicate which runs apply to the command. RUN may be a run ID, a run ID prefix, or a one-based index corresponding to a run returned by the list command.
Indexes may also be specified in ranges in the form START:END where START is the start index and END is the end index. Either START or END may be omitted. If START is omitted, all runs up to END are selected. If END is omitted, all runs from START on are selected. If both START and END are omitted (i.e. the : char is used by itself) all runs are selected.
If a RUN argument is not specified, : is assumed (all runs are selected).
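The index and range forms can be modeled as follows. This is a sketch only: `select_runs` is a hypothetical helper, it handles indexes and ranges but not run IDs or prefixes, and it assumes END is inclusive, which the text above does not state explicitly.

```python
def select_runs(runs, spec=":"):
    """Select runs by one-based index or START:END range.

    `runs` is the list as shown by the list command. END is treated as
    inclusive (an assumption made for this sketch).
    """
    if ":" not in spec:
        index = int(spec)
        if index < 1 or index > len(runs):
            raise IndexError(f"run index out of range: {index}")
        return [runs[index - 1]]
    start_text, end_text = spec.split(":", 1)
    start = int(start_text) if start_text else 1
    end = int(end_text) if end_text else len(runs)
    if start < 1:
        raise IndexError(f"run index out of range: {start}")
    return runs[start - 1:end]


runs = ["r1", "r2", "r3", "r4", "r5"]
middle = select_runs(runs, "2:4")
```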
Filter by Operation
Runs may be filtered by operation using --operation. A run is only included if any part of its full operation name, including the package and model name, matches the value.
Use --operation multiple times to include more runs.
Filter by Label
Use --label to only include runs with labels containing a specified value. To select runs that do not contain a label, specify a dash ‘-’ for VAL.
Use --label multiple times to include more runs.
Filter by Tag
Use --tag to only include runs with a specified tag. Tags must match completely and are case sensitive.
Use --tag multiple times to include more runs.
Filter by Marked and Unmarked
Use --marked to only include marked runs.
Use --unmarked to only include unmarked runs. This option may not be used with --marked.
Filter by Expression
Use --filter to limit runs to those that match a filter expression. Filter expressions compare run attributes, flag values, or scalars to target values. They may include multiple expressions with logical operators.
For example, to match runs with flag batch-size equal to 100 that have loss less than 0.8, use:
--filter 'batch-size = 100 and loss < 0.8'
IMPORTANT: You must quote EXPR if it contains spaces or characters that the shell uses (e.g. ‘<’ or ‘>’).
Target values may be numbers, strings, or lists containing numbers and strings. Strings that contain spaces must be quoted; otherwise, target string values do not require quotes. Lists are defined using square braces where each item is separated by a comma.
Comparisons may use the following operators: ‘=’, ‘!=’ (or ‘<>’), ‘<’, ‘<=’, ‘>’, ‘>=’. Text comparisons may use ‘contains’ to test for case-insensitive string membership. A value may be tested for membership or not in a list using ‘in’ or ‘not in’ respectively. A value may be tested for undefined using ‘is undefined’ or defined using ‘is not undefined’.
Logical operators include ‘or’ and ‘and’. An expression may be negated by preceding it with ‘not’. Parentheses may be used to control the order of precedence when expressions are evaluated.
If a value reference matches more than one type of run information (e.g. a flag is named ‘label’, which is also a run attribute), the value is read in order of run attribute, then flag value, then scalar. To disambiguate the reference, use a prefix attr:, flag:, or scalar: as needed. For example, to filter using a flag value named ‘label’, use ‘flag:label’.
Other examples:
operation = train and acc > 0.9
operation = train and (acc > 0.9 or loss < 0.3)
batch-size = 100 or batch-size = 200
batch-size in [100,200]
batch-size not in [400,800]
batch-size is undefined
batch-size is not undefined
label contains best and operation not in [test,deploy]
status in [error,terminated]
NOTE: Comments and tags are not supported in filter expressions at this time. Use --comment and --tag options along with filter expressions to further refine a selection.
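When the command is launched from a script, the quoting requirement for EXPR can be handled in one of two ways: pass the expression as a single list element (no shell involved), or let `shlex.join` produce a shell-safe command line. The sketch below shows both; `compare_command` is a hypothetical helper and `shlex.join` requires Python 3.8 or later.

```python
import shlex


def compare_command(filter_expr, extra_args=()):
    """Build a `guild compare --filter` command.

    Returns the argument list (suitable for subprocess.run without a shell)
    and the equivalent shell-quoted command line.
    """
    argv = ["guild", "compare", "--filter", filter_expr, *extra_args]
    return argv, shlex.join(argv)


argv, shell_line = compare_command("batch-size = 100 and loss < 0.8")
print(shell_line)
# prints: guild compare --filter 'batch-size = 100 and loss < 0.8'
```

Passing `argv` directly to `subprocess.run` avoids shell interpretation of ‘<’ and ‘>’ altogether.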
Filter by Run Start Time
Use --started to limit runs to those that have started within a specified time range.
IMPORTANT: You must quote RANGE values that contain spaces. For example, to filter runs started within the last hour, use the option:
--started 'last hour'
You can specify a time range using several different forms:
after DATETIME
before DATETIME
between DATETIME and DATETIME
last N minutes|hours|days
today|yesterday
this week|month|year
last week|month|year
N days|weeks|months|years ago
DATETIME may be specified as a date in the format YY-MM-DD (the leading YY- may be omitted) or as a time in the format HH:MM (24 hour clock). A date and time may be specified together as DATE TIME.
When using between DATETIME and DATETIME, values for DATETIME may be specified in either order.
When specifying values like minutes and hours the trailing s may be omitted to improve readability. You may also use min instead of minutes and hr instead of hours.
Examples:
after 7-1
after 9:00
between 1-1 and 4-30
between 10:00 and 15:00
last 30 min
last 6 hours
today
this week
last month
3 weeks ago
Filter by Source Code Digest
To show runs for a specific source code digest, use -g or --digest with a complete or partial digest value.
Filter by Run Status
Runs may also be filtered by specifying one or more status filters: --running, --completed, --error, and --terminated. These may be used together to include runs that match any of the filters. For example, to only include runs that were either terminated or exited with an error, use --terminated --error, or the short form -Set.
You may combine more than one status character with -S to expand the filter. For example, -Set shows only runs with terminated or error status.
Status filters are applied before RUN indexes are resolved. For example, a run index of 1 is the latest run that matches the status filters.
Tools
Guild provides support for comparing runs using different tools. To use a different tool, specify its name with the --tool option.
Supported tools:
hiplot    Facebook’s HiPlot high dimensionality viewer for evaluating hyperparameters.
Show Batch Runs
By default, batch runs are not included in comparisons. To include batch runs, specify --include-batch.
Options
--min COLUMN
  Show the lowest values for COLUMN first.
--max COLUMN
  Show the highest values for COLUMN first.
-n, --limit, --top N
  Only show the top N runs.
-e, --extra-cols
  Show extra columns such as source code hash.
-s, --all-scalars
  Include all scalars. By default system scalars are omitted.
-c, --cols COLUMNS
  Comma delimited list of additional columns to compare. See Column Specs for information on column format. Cannot be used with --strict-columns.
-cc, --strict-cols COLUMNS
  Comma delimited list of the columns to compare. See Column Specs for information on column format. Cannot be used with --columns.
-p, --skip-op-cols
  Don’t show columns specified by the ‘compare’ operation attribute in the Guild file.
-r, --skip-core
  Don’t show core columns.
-u, --skip-unchanged
  Don’t show columns with values that do not change.
-t, --table
  Show comparison data as a table.
--csv PATH
  Save comparison data to a CSV file. Use ‘-’ for PATH to print to standard output.
--tool TOOL
  Use TOOL to compare runs. See Tools for a list of supported tools.
--include-batch
  Include batch runs.
--print-scalars
  Show available scalars and exit.
-F, --filter EXPR
  Filter runs using a filter expression. See Filter by Expression above for details.
-Fo, --operation VAL
  Filter runs with operations matching VAL.
-Fl, --label VAL
  Filter runs with labels matching VAL. To show unlabeled runs, use --unlabeled.
-Fu, --unlabeled
  Filter runs without labels.
-Ft, --tag TAG
  Filter runs with TAG.
-Fc, --comment VAL
  Filter runs with comments matching VAL.
-Fm, --marked
  Filter marked runs.
-Fn, --unmarked
  Filter unmarked runs.
-Fs, --started RANGE
  Filter runs started within RANGE. See above for valid time ranges.
-Fd, --digest VAL
  Filter runs with a matching source code digest.
-Sr, --running / --not-running
  Filter runs that are still running.
-Sc, --completed / --not-completed
  Filter completed runs.
-Se, --error / --not-error
  Filter runs that exited with an error.
-St, --terminated / --not-terminated
  Filter runs terminated by the user.
-Sp, --pending / --not-pending
  Filter pending runs.
-Ss, --staged / --not-staged
  Filter staged runs.
--help
  Show this message and exit.
Guild AI version 0.9.0