Command: compare

guildai · June 10, 2020, 12:21am

Usage

guild compare [OPTIONS] [RUN...]

Compare run results.

Guild Compare is a console based application that displays a spreadsheet of runs with their current accuracy and loss. The application will continue to run until you exit it by pressing q (for quit).

Guild Compare supports a number of commands. Commands are activated by pressing a letter. To view the list of commands, press ?.

Guild Compare does not automatically update to display the latest available data. If you want to update the list of runs and their status, press r (for refresh).

You may alternatively use the --csv option to write a CSV file containing the compare data. To print the CSV contents to standard output, use ‘-’ for the file path.
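
As a rough sketch of how the CSV output might be consumed from a script, the following Python snippet parses CSV text of the kind printed by `guild compare --csv -`. The column headings and values in `sample` are hypothetical placeholders. The real header depends on your runs and columns.

```python
import csv
import io


def parse_compare_csv(text):
    """Parse compare CSV text into a list of dicts keyed by column heading."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data; real text would come from: guild compare --csv -
sample = "run,operation,loss\nabc123,train,0.25\ndef456,train,0.31\n"

rows = parse_compare_csv(sample)

# CSV values are strings, so convert before comparing numerically.
best = min(rows, key=lambda row: float(row["loss"]))
print(best["run"])
```

In practice the text could be captured with something like `subprocess.run(["guild", "compare", "--csv", "-"], capture_output=True, text=True).stdout`, assuming `guild` is on the path.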

Compare Columns

Guild Compare shows columns for each run based on the columns defined for each run operation. Additional columns may be specified using the --columns option, which must be a comma separated list of column specs. See below for column spec details.

If multiple columns have the same name, they are merged into a single column. Cell values are merged by taking the first non-null value in the list of cells with the common name from left-to-right.
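
The merge rule above can be modeled in a few lines of Python. This is only an illustration of the described behavior, not Guild's implementation. The function names and the sample row are made up for the example.

```python
def merge_cells(cells):
    """Return the first non-null value, scanning left to right."""
    for value in cells:
        if value is not None:
            return value
    return None


def merge_row(headings, values):
    """Merge the cells of one row whose columns share a heading."""
    grouped = {}
    for heading, value in zip(headings, values):
        grouped.setdefault(heading, []).append(value)
    return {heading: merge_cells(cells) for heading, cells in grouped.items()}


# Two columns are both named "loss"; the left one is empty for this run,
# so the merged cell takes the value from the right one.
row = merge_row(["loss", "acc", "loss"], [None, 0.9, 0.2])
print(row)  # {'loss': 0.2, 'acc': 0.9}
```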

By default, columns always contain run ID, model, operation, started, time, label, status, and the set of columns defined for each displayed operation. You can skip the core columns with --skip-core and skip the operation columns with --skip-op-cols.

Column Specs

Each column specified in COLUMNS must be a valid column spec. A column spec is the name of a run flag or scalar key. Flag names must be preceded by an equals sign = to differentiate them from scalar keys.

For example, to include the flag epochs as a column, use --columns =epochs.

If a scalar is specified, it may be preceded by a qualifier of min, max, first, last, avg, total, or count to indicate the type of scalar value. For example, to include the highest logged value for accuracy, use --columns "max accuracy".

By default last is assumed, so that the last logged value for the specified scalar is used.

A scalar spec may additionally contain the key word step to indicate that the step associated with the scalar is used. For example, to include the step of the last accuracy value, use --columns "accuracy step". Step may be used with scalar qualifiers. For example, to include the value and associated step of the lowest loss, use --columns "min loss, min loss step".

Column specs may contain an alternative column heading using the keyword as in the format COL as HEADING. Headings that contain spaces must be quoted.

For example, to include the scalar val_loss with name validation loss, use --columns val_loss as 'validation loss'.

You may include run attributes as column specs by preceding the run attribute name with a period (.). For example, to include the stopped attribute, use --columns .stopped. This is useful when using --skip-core.
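
When building a column list from a script, the spec rules above can be captured in small helpers. The sketch below is hypothetical: the function names are invented for illustration, and it only assembles strings in the formats described in this section.

```python
QUALIFIERS = {"min", "max", "first", "last", "avg", "total", "count"}


def with_heading(spec, heading=None):
    """Append 'as HEADING', quoting headings that contain spaces."""
    if heading is None:
        return spec
    if " " in heading:
        heading = "'%s'" % heading
    return "%s as %s" % (spec, heading)


def flag_spec(name, heading=None):
    """Flags are preceded by an equals sign."""
    return with_heading("=" + name, heading)


def attr_spec(name, heading=None):
    """Run attributes are preceded by a period."""
    return with_heading("." + name, heading)


def scalar_spec(key, qualifier=None, step=False, heading=None):
    """Scalars take an optional qualifier and an optional 'step' keyword."""
    if qualifier is not None and qualifier not in QUALIFIERS:
        raise ValueError("unknown qualifier: %s" % qualifier)
    parts = ([qualifier] if qualifier else []) + [key] + (["step"] if step else [])
    return with_heading(" ".join(parts), heading)


columns = ", ".join([
    flag_spec("epochs"),
    scalar_spec("loss", "min"),
    scalar_spec("loss", "min", step=True),
    scalar_spec("val_loss", heading="validation loss"),
    attr_spec("stopped"),
])
print(columns)
```

The resulting string is the value to pass as the comma separated list of column specs.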

Sort Runs

Use --min and --max to sort results by a particular column. --min sorts in ascending order and --max sorts in descending order.

When specifying COLUMN, use the column name as displayed in the table output. If the column name contains spaces, quote the value.

By default, runs are sorted by start time in descending order, i.e. the most recent runs are listed first.

Limit Runs

To limit the results to the top N runs, use --top.
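
The combined effect of sorting and limiting can be pictured with the following Python model. It is a sketch under assumptions: the rows, column names, and the `top_runs` helper are hypothetical, and every row is assumed to have a value for the sort column.

```python
def top_runs(rows, column, descending=False, top=None):
    """Sort rows by a column (like --min or --max) and keep the top N (like --top)."""
    ranked = sorted(rows, key=lambda row: row[column], reverse=descending)
    return ranked if top is None else ranked[:top]


# Hypothetical compare rows.
rows = [
    {"run": "a1", "loss": 0.42, "acc": 0.88},
    {"run": "b2", "loss": 0.17, "acc": 0.95},
    {"run": "c3", "loss": 0.30, "acc": 0.91},
]

# Lowest loss first, keeping two runs.
lowest_loss = top_runs(rows, "loss", top=2)
print([row["run"] for row in lowest_loss])  # ['b2', 'c3']

# Highest accuracy first, keeping one run.
highest_acc = top_runs(rows, "acc", descending=True, top=1)
print([row["run"] for row in highest_acc])  # ['b2']
```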

Specify Runs

You may use one or more RUN arguments to indicate which runs apply to the command. RUN may be a run ID, a run ID prefix, or a one-based index corresponding to a run returned by the list command.

Indexes may also be specified in ranges in the form START:END where START is the start index and END is the end index. Either START or END may be omitted. If START is omitted, all runs up to END are selected. If END is omitted, all runs from START on are selected. If both START and END are omitted (i.e. the : char is used by itself) all runs are selected.

If a RUN argument is not specified, : is assumed (all runs are selected).
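
The index and range rules can be modeled as below. This is an illustrative sketch, not Guild's code. It assumes END is inclusive, which the text above does not state explicitly, and it ignores run IDs and ID prefixes.

```python
def select_runs(runs, spec=":"):
    """Select runs by a one-based index or a START:END range."""
    if ":" not in spec:
        return [runs[int(spec) - 1]]
    start_text, end_text = spec.split(":", 1)
    start = int(start_text) if start_text else 1
    end = int(end_text) if end_text else len(runs)
    # ASSUMPTION: END is inclusive, so 2:3 selects the second and third runs.
    return runs[start - 1:end]


# Hypothetical run IDs in the order returned by the list command.
runs = ["r1", "r2", "r3", "r4"]

print(select_runs(runs, "2:3"))  # ['r2', 'r3']
print(select_runs(runs, ":2"))   # ['r1', 'r2']
print(select_runs(runs, "3:"))   # ['r3', 'r4']
print(select_runs(runs))         # ['r1', 'r2', 'r3', 'r4']
```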

Filter by Operation

Runs may be filtered by operation using --operation. A run is only included if any part of its full operation name, including the package and model name, matches the value.

Use --operation multiple times to include more runs.

Filter by Label

Use --label to only include runs with labels containing a specified value. To select runs that do not contain a label, specify a dash ‘-’ for VAL.

Use --label multiple times to include more runs.

Filter by Tag

Use --tag to only include runs with a specified tag. Tags must match completely and are case sensitive.

Use --tag multiple times to include more runs.

Filter by Marked and Unmarked

Use --marked to only include marked runs.

Use --unmarked to only include unmarked runs. This option may not be used with --marked.

Filter by Expression

Use --filter to limit runs to those that match a filter expression. Filter expressions compare run attributes, flag values, or scalars to target values. They may include multiple expressions with logical operators.

For example, to match runs with flag batch-size equal to 100 that have loss less than 0.8, use:

--filter 'batch-size = 100 and loss < 0.8'

IMPORTANT: You must quote EXPR if it contains spaces or characters that the shell uses (e.g. ‘<’ or ‘>’).

Target values may be numbers, strings, or lists containing numbers and strings. Strings that contain spaces must be quoted; otherwise a target string value does not require quotes. Lists are defined using square brackets where each item is separated by a comma.

Comparisons may use the following operators: ‘=’, ‘!=’ (or ‘<>’), ‘<’, ‘<=’, ‘>’, ‘>=’. Text comparisons may use ‘contains’ to test for case-insensitive string membership. A value may be tested for membership or not in a list using ‘in’ or ‘not in’ respectively. A value may be tested for undefined using ‘is undefined’ or defined using ‘is not undefined’.

Logical operators include ‘or’ and ‘and’. An expression may be negated by preceding it with ‘not’. Parentheses may be used to control the order of precedence when expressions are evaluated.

If a value reference matches more than one type of run information (e.g. a flag is named ‘label’, which is also a run attribute), the value is read in order of run attribute, then flag value, then scalar. To disambiguate the reference, use a prefix attr:, flag:, or scalar: as needed. For example, to filter using a flag value named ‘label’, use ‘flag:label’.
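
The lookup order for value references can be sketched as follows. The run dictionary layout and the `resolve` function are hypothetical and exist only to illustrate the order described above (run attribute, then flag, then scalar) and the effect of the attr:, flag:, and scalar: prefixes.

```python
KINDS = ("attr", "flag", "scalar")


def resolve(run, ref):
    """Resolve a value reference against a run, honoring an optional prefix."""
    kind, sep, name = ref.partition(":")
    if sep and kind in KINDS:
        # An explicit prefix restricts the lookup to one type of information.
        return run[kind + "s"].get(name)
    for kind in KINDS:
        # Without a prefix, check attributes, then flags, then scalars.
        if ref in run[kind + "s"]:
            return run[kind + "s"][ref]
    return None


# Hypothetical run with a flag named 'label', which is also a run attribute.
run = {
    "attrs": {"label": "best run"},
    "flags": {"label": "v2", "batch-size": 100},
    "scalars": {"loss": 0.4},
}

print(resolve(run, "label"))       # best run
print(resolve(run, "flag:label"))  # v2
print(resolve(run, "loss"))        # 0.4
```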

Other examples:

operation = train and acc > 0.9
operation = train and (acc > 0.9 or loss < 0.3)
batch-size = 100 or batch-size = 200
batch-size in [100,200]
batch-size not in [400,800]
batch-size is undefined
batch-size is not undefined
label contains best and operation not in [test,deploy]
status in [error,terminated]

NOTE: Comments and tags are not supported in filter expressions at this time. Use --comment and --tag options along with filter expressions to further refine a selection.

Filter by Run Start Time

Use --started to limit runs to those that have started within a specified time range.

IMPORTANT: You must quote RANGE values that contain spaces. For example, to filter runs started within the last hour, use the option:

--started 'last hour'

You can specify a time range using several different forms, as illustrated in the examples at the end of this section.

DATETIME may be specified as a date in the format YY-MM-DD (the leading YY- may be omitted) or as a time in the format HH:MM (24 hour clock). A date and time may be specified together as DATE TIME.

When using between DATETIME and DATETIME, values for DATETIME may be specified in either order.

When specifying values like minutes and hours the trailing s may be omitted to improve readability. You may also use min instead of minutes and hr instead of hours.

Examples:

after 7-1
after 9:00
between 1-1 and 4-30
between 10:00 and 15:00
last 30 min
last 6 hours
today
this week
last month
3 weeks ago

Filter by Source Code Digest

To show runs for a specific source code digest, use -g or --digest with a complete or partial digest value.

Filter by Run Status

Runs may also be filtered by specifying one or more status filters: --running, --completed, --error, and --terminated. These may be used together to include runs that match any of the filters. For example, to only include runs that were either terminated or exited with an error, use --terminated --error, or the short form -Set.

You may combine more than one status character with -S to expand the filter. For example, -Set shows only runs with terminated or error status.

Status filters are applied before RUN indexes are resolved. For example, a run index of 1 is the latest run that matches the status filters.
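
The order of operations matters when an index is combined with status filters. The following Python model is a sketch with hypothetical run data. It assumes the runs are already ordered most recent first.

```python
def resolve_index(runs, index, statuses=None):
    """Apply status filters first, then resolve a one-based run index."""
    matching = [
        run for run in runs
        if statuses is None or run["status"] in statuses
    ]
    return matching[index - 1]


# Hypothetical runs, most recent first.
runs = [
    {"id": "c3", "status": "completed"},
    {"id": "b2", "status": "error"},
    {"id": "a1", "status": "terminated"},
]

# Without filters, index 1 is the latest run overall.
print(resolve_index(runs, 1)["id"])  # c3

# With terminated and error filters, index 1 is the latest matching run.
print(resolve_index(runs, 1, {"terminated", "error"})["id"])  # b2
```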

Tools

Guild provides support for comparing runs using different tools. To use a different tool, specify its name with the --tool option.

Supported tools:

hiplot	Facebook’s HiPlot high dimensionality viewer for evaluating hyperparameters.

Show Batch Runs

By default, batch runs are not included in comparisons. To include batch runs, specify --include-batch.

Options


`--min COLUMN`	Show the lowest values for COLUMN first.
`--max COLUMN`	Show the highest values for COLUMN first.
`-n, --limit, --top N`	Only show the top N runs.
`-e, --extra-cols`	Show extra columns such as source code hash.
`-s, --all-scalars`	Include all scalars. By default system scalars are omitted.
`-c, --cols COLUMNS`	Comma delimited list of additional columns to compare. See Column Specs for information on column format. Cannot be used with --strict-columns.
`-cc, --strict-cols COLUMNS`	Comma delimited list of the columns to compare. See Column Specs for information on column format. Cannot be used with --columns.
`-p, --skip-op-cols`	Don’t show columns specified by the ‘compare’ operation attribute in the Guild file.
`-r, --skip-core`	Don’t show core columns.
`-u, --skip-unchanged`	Don’t show columns with values that do not change.
`-t, --table`	Show comparison data as a table.
`--csv PATH`	Save comparison data to a CSV file. Use ‘-’ for PATH to print to standard output.
`--tool TOOL`	Use TOOL to compare runs. See Tools above for a list of supported tools.
`--include-batch`	Include batch runs.
`--print-scalars`	Show available scalars and exit.
`-F, --filter EXPR`	Filter runs using a filter expression. See Filter by Expression above for details.
`-Fo, --operation VAL`	Filter runs with operations matching `VAL`.
`-Fl, --label VAL`	Filter runs with labels matching VAL. To show unlabeled runs, use --unlabeled.
`-Fu, --unlabeled`	Filter runs without labels.
`-Ft, --tag TAG`	Filter runs with TAG.
`-Fc, --comment VAL`	Filter runs with comments matching VAL.
`-Fm, --marked`	Filter marked runs.
`-Fn, --unmarked`	Filter unmarked runs.
`-Fs, --started RANGE`	Filter runs started within RANGE. See above for valid time ranges.
`-Fd, --digest VAL`	Filter runs with a matching source code digest.
`-Sr, --running / --not-running`	Filter runs that are still running.
`-Sc, --completed / --not-completed`	Filter completed runs.
`-Se, --error / --not-error`	Filter runs that exited with an error.
`-St, --terminated / --not-terminated`	Filter runs terminated by the user.
`-Sp, --pending / --not-pending`	Filter pending runs.
`-Ss, --staged / --not-staged`	Filter staged runs.
`--help`	Show this message and exit.

Guild AI version 0.9.0
