Run names

Summary

This proposal outlines a run-identification scheme that uses human-readable names rather than run IDs.

It is yet to be determined whether this is a breaking change.

This proposal is under development.

Problem

Guild uses UUIDs to uniquely identify runs. These identifiers carry no meaning of their own and are used only to differentiate runs.

These IDs are presented to users in several ways:

  • User interface: run listings and run metadata
  • User input: when specifying operation dependencies

While labels are useful for capturing higher-level summary information, they are not intended to identify runs. Guild does not currently provide a user-friendly mechanism for identifying runs.

Proposed Approach

Guild will introduce a run name scheme. A run name is used to uniquely identify a run in the context of a Guild environment.

Names will be used in all applicable UI to represent a run. Guild will continue to show run IDs where it makes sense (e.g. guild runs info).

Guild will use a name generator to create human-readable names.

Issues

There are a number of issues that must be addressed before proceeding with this proposal.

  • How are names automatically generated?
  • Are names generated for every run?
  • Are unique names generated for every batch trial?
  • How are duplicate names resolved for new runs?
  • How are duplicate names resolved for run-merge operations (e.g. push, pull, export, import)?
  • How are long names presented in compact listings (e.g. guild runs list)?
  • Can runs be renamed and, if so, how are existing references that use the old names treated?
  • Do we want to optionally disable run names with user config? (e.g. an option or a way to configure the columns that are used in terminal listings)

Name generating algorithm

This is a hard problem. While automatic name generators are common, they are not without problems. We need to look closely at the history of such generators and learn from it.

The temptation is to use English for name generation. This is fine for English-speaking users. For non-English-speaking users, generated names may have little value over hexadecimal values.

While Guild is not internationalized, it may be at some point. This then suggests language packs that include word lists for run name generation.
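
As a rough illustration of the language-pack idea, the following sketch picks an adjective and a noun from a per-language word list. The WORD_LISTS table, its contents and the generate_name function are all hypothetical and are not part of Guild. A real word list would need to be much larger and vetted for the problems noted above.

```python
import random

# Hypothetical word lists keyed by language code. A real language pack
# would ship far larger, carefully vetted lists.
WORD_LISTS = {
    "en": {
        "adjectives": ["red", "quiet", "bright", "swift"],
        "nouns": ["robin", "maple", "river", "comet"],
    },
}


def generate_name(lang="en", rng=None):
    """Return an '<adjective>-<noun>' name drawn from the word list for lang."""
    if rng is None:
        rng = random.Random()
    words = WORD_LISTS[lang]
    adjective = rng.choice(words["adjectives"])
    noun = rng.choice(words["nouns"])
    return f"{adjective}-{noun}"
```

Passing a seeded random.Random instance makes the output reproducible, which may be useful in tests. With lists this small, repeated names are very likely.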

Selective name generation

Should every run use a unique name? Probably not. At the least, trials ought to use a shared name with either an incrementing suffix or a timestamp.

Compact run lists

How are longer names presented in run lists in a terminal where column space is at a premium? Do we allocate more space to the run name column? Do we truncate names as we do run IDs? Do we split names to reflect the start and end characters as we do for operation names?

Are names used in place of run IDs or as a separate column?

If this scheme is to be usable, we ought to present a usable value in compact lists that can be used when specifying the run for a downstream operation.
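
One option raised above is to split a long name so that its start and end characters stay visible. A minimal sketch of that idea follows. The compact_name function, the "..." marker and the way characters are divided between head and tail are assumptions for illustration and do not reflect how Guild currently formats operation names.

```python
def compact_name(name, width, marker="..."):
    """Fit name into width characters, keeping its start and end visible."""
    if width <= len(marker):
        raise ValueError("width must be larger than the marker")
    if len(name) <= width:
        return name
    keep = width - len(marker)   # characters of the name we can show
    head = (keep + 1) // 2       # give any odd character to the start
    tail = keep - head
    end = name[-tail:] if tail else ""
    return name[:head] + marker + end
```

For example, compact_name("extraordinarily-long-name", 12) yields "extra...name". Note that a compacted value is no longer the full name, so it could only be used to specify a run if Guild also accepted some form of partial match.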

Name conflicts

As run names will not be globally unique, we must support cases where multiple runs share the same name.
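
To make the shared-name case concrete, the sketch below treats a name lookup as a query that may return several runs. The Run class, its fields and the choice to order matches by start time are assumptions made for illustration. The proposal does not yet define how one run is selected when several share a name.

```python
from dataclasses import dataclass


@dataclass
class Run:
    # Hypothetical minimal run record, not Guild's run type.
    id: str
    name: str
    started: float  # start time in seconds since the epoch


def runs_for_name(runs, name):
    """Return every run called name, most recently started first."""
    matches = [run for run in runs if run.name == name]
    return sorted(matches, key=lambda run: run.started, reverse=True)
```

A caller that needs a single run could take the first match, which mirrors how a label or tag filter would behave, or report the ambiguity to the user.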

Trial names

Guild ought not to use unique names for trials. It should use a single name (the name associated with the batch) and incrementing suffixes for each trial name.

E.g. the command guild run train lr=[1e-4,1e-3] --name red-robin generates trials of red-robin:1 and red-robin:2.
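
The suffix scheme in this example can be sketched as follows. The trial_names function is hypothetical. The ":" separator and numbering from 1 are taken from the example above.

```python
def trial_names(batch_name, trial_count):
    """Return names of the form '<batch_name>:<n>', numbering trials from 1."""
    return [f"{batch_name}:{n}" for n in range(1, trial_count + 1)]
```

Calling trial_names("red-robin", 2) returns ["red-robin:1", "red-robin:2"], matching the two trials generated for the two lr values.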

Working decisions

The following is a working list of decisions associated with the final proposal:

  • Use a new run attribute name
  • Continue to show [<index>:<hash>] in column 1 of terminal run listings (guild runs list)
  • Otherwise leave support for using run IDs exactly as they are
  • Use a new column (likely to appear to the right of the operation column)
  • Use the value splitting scheme used for showing run sources to maintain a fixed column width for names
  • Look for a way to get space from other columns (e.g. single char symbol for run status, more compact date/time format, drop the run location info)
  • Support use of run names for operation dependencies - use the resolved run names in resource flag names and labels
  • Continue to use run IDs when recording resolved dependencies (i.e. the run deps attribute)
  • Name collisions are not a problem as names are treated in the same way as labels and tags - they just have a special use in UI and dependency specification
  • Do not generate names by default but provide a --name option to the run command that accepts zero or one argument; when no argument is given, Guild generates a name (or alternatively --name <val> and --auto-name)
  • Support user config to enable auto-name generation by default and provide a --no-name option to run to bypass that (this could be deferred for a followup enhancement if the feature is generally adopted)
  • Trials use the name of their batch with incrementing suffixes
  • Use English for the naming scheme with an eye toward different languages when Guild is internationalized
  • Find a naming scheme that minimizes the problems associated with random name generation (e.g. negative connotations, personalized, etc.)
  • Support renames
  • Do not attempt to modify other runs that use old names (e.g. a run with a label or flag value containing an old name will not be updated to reflect renames)
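
The --name decision above (an option that takes zero or one argument) can be sketched with Python's standard argparse module. This is only an illustration of the option's behavior, not Guild's command implementation. The parser, the AUTO_NAME sentinel and the resolve_name helper are hypothetical.

```python
import argparse

# Sentinel meaning "--name was given without a value".
AUTO_NAME = object()


def build_parser():
    parser = argparse.ArgumentParser(prog="run-sketch")
    # nargs="?" accepts zero or one value. const is used when the option
    # appears without a value, default when it does not appear at all.
    parser.add_argument("--name", nargs="?", const=AUTO_NAME, default=None)
    return parser


def resolve_name(args, generate):
    """Return the run name to use, or None when no name was requested."""
    if args.name is None:
        return None
    if args.name is AUTO_NAME:
        return generate()  # generate is any zero-argument name generator
    return args.name
```

One caveat with this form is that an option taking an optional value will consume whatever token follows it, so a command like run --name train would treat train as the name rather than the operation. That ambiguity is an argument for the alternative of separate --name <val> and --auto-name options.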

Alternative Approaches

Do nothing

As with all proposals, we must consider the option of keeping the scheme as-is.

The advantage of using UUIDs is that they address each of the issues listed above. The disadvantage is that they’re not user-friendly (not memorable, hard to specify). This is a trade-off between straightforward implementation and behavior on the one hand and usability on the other. It’s not clear that any of the proposed approaches avoids introducing new usability problems of its own.

Aliases

This alternative is likely in name only: an alias could be considered another term for name. The term alias might be a more appropriate way to think of this scheme, as the run otherwise remains the same. Aliases may be optionally used to refer to a run (whereas the concept of name implies a primary identifier).

Use tags

Guild supports tags, which are lists of string values associated with a run. A tag could be used as a human-readable identifier for a run. Guild could generate a unique tag value for each run, either by default or when a run command option is specified.

This alternative does not address the usability concerns unless the tag is displayed prominently in the UI (e.g. in run lists). Nor does it address any of the issues listed above. It’s merely an alternative encoding scheme.

To address the usability concerns, we could support a tag naming scheme name:<value>. But at this point, a distinct name run attribute is far cleaner.
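
For comparison, reading a name out of a name:<value> tag might look like the sketch below. The name_from_tags function is hypothetical, and taking the first matching tag is an assumption, since nothing would prevent a run from carrying several such tags.

```python
def name_from_tags(tags, prefix="name:"):
    """Return the value of the first tag starting with prefix, or None."""
    for tag in tags:
        if tag.startswith(prefix):
            return tag[len(prefix):]
    return None
```

Every part of the UI that shows a name would need this kind of parsing, which is the extra convention that a dedicated name attribute avoids.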

Pros: avoids introducing a new run attribute or run-identification concept

Cons: merely skirts the issue as it would require a specialized tag naming convention to show the name in run lists