Guild home default location

Summary

This proposal seeks to align Guild’s default behavior with user expectations for where Guild stores runs and which runs its commands report. It introduces the concept of a “project directory” and uses that location as Guild home when commands are run within that directory or one of its subdirectories.

This proposal is under development.

Background

Guild home is a directory that contains runs, configuration, and cached files. It’s a central concept in Guild that impacts all users. By default, Guild home is ~/.guild. Runs from across a system are stored in the same location and can therefore be compared to one another.

If Guild runs within a virtual environment (created by virtualenv/venv or Conda), Guild home is $ENV/.guild. This has the effect of isolating runs generated within an activated virtual environment to that environment.

The environment variable GUILD_HOME can be set to override the default behavior.

The guild command option -H can be used to override all other methods of determining Guild home.

Problem

A typical user workflow runs experiments for a specific project. A user is concerned with project editing (coding, etc.) and running experiments. The user is interested in viewing and comparing runs within that project and is not concerned with other projects within that context.

It’s less common for a user to compare runs across projects than to compare runs within one project.

Guild’s default location (~/.guild) consolidates all runs across all projects. Run-related commands therefore apply to all runs, regardless of where the commands are run.

Seeing runs across projects is distracting when a user wants to focus on a single project.

Guild circumvents this problem by changing its behavior in the case of virtual environments. When a virtual environment is activated, Guild uses a different home location: it looks for runs in $ENV/.guild. This has the effect of consolidating runs per environment. As Python projects are commonly associated with a virtual environment, this in turn isolates runs per project, at least for typical Python projects.

Projects that don’t use Python virtual environments (e.g. R projects) don’t see this behavior. To isolate runs per project in that case, a user would have to set GUILD_HOME or construct and activate a Python virtual environment before running Guild commands. Neither of these options is acceptable.

Proposed Approach

Guild will use the following methods of specifying Guild home, in order of precedence:

  • -H with a command
  • GUILD_HOME environment variable
  • .guild or a project sentinel in the current directory (Guild home is then the .guild subdirectory alongside the sentinel)
  • The same rule applied to each parent directory in turn, nearest first
  • ~/.guild

A project sentinel is a file or directory whose presence implies a project directory. Examples include .Rprofile, .Renviron, and .Rproj (common R project files) and .vscode, .idea, and .metadata (project directories created by popular IDEs).

The sentinel file list would be generated by plugins so that Guild can be extended with both language and IDE support.
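The precedence order above could be sketched roughly as follows. This is only an illustration of the proposed lookup, not Guild’s implementation: the function name resolve_guild_home, its parameters, and the hard-coded SENTINELS tuple are assumptions made for the example. Sentinels are matched by exact name here for simplicity, and the real list would come from plugins.

```python
import os
from pathlib import Path

# Hypothetical sentinel list, taken from the examples in this proposal.
# Under the proposal, plugins would contribute these names.
SENTINELS = (".Rprofile", ".Renviron", ".Rproj", ".vscode", ".idea", ".metadata")


def resolve_guild_home(cwd, cli_home=None, env=None, sentinels=SENTINELS):
    """Return the Guild home path for a command run from cwd.

    cli_home stands in for the value of the -H option, if given.
    """
    env = os.environ if env is None else env

    # 1. -H with a command overrides everything else.
    if cli_home:
        return Path(cli_home)

    # 2. GUILD_HOME environment variable.
    if env.get("GUILD_HOME"):
        return Path(env["GUILD_HOME"])

    # 3 and 4. Look for .guild or a project sentinel in the current
    # directory, then in each parent directory, nearest first.
    cwd = Path(cwd).resolve()
    for directory in (cwd, *cwd.parents):
        has_guild_dir = (directory / ".guild").is_dir()
        has_sentinel = any((directory / name).exists() for name in sentinels)
        if has_guild_dir or has_sentinel:
            return directory / ".guild"

    # 5. Fall back to the user-level default.
    return Path.home() / ".guild"
```

With this sketch, running a command from a subdirectory of a project that contains .Rprofile resolves to the project’s .guild directory, while setting GUILD_HOME or passing -H bypasses the directory search entirely.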

Virtual environments

Guild currently stores runs within the top-level directory of an activated virtual environment. For example, when a virtual environment located in ./venv is activated (e.g. by running source venv/bin/activate), Guild uses ./venv/.guild as Guild home. It’s unconventional to store user data within a virtual environment, which is often treated as a disposable structure that provides runtime isolation for static libraries.

Guild would continue to work this way for virtual environment directories that already contain .guild, in order to support existing environments.

If .guild does not exist in the activated environment, Guild would proceed with the above Guild home resolution steps, not considering the environment unless it was otherwise detected using the standard method.
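As a rough sketch of this backward-compatibility check, the helper below returns an environment’s .guild directory only when it already exists. The function name legacy_env_guild_home is hypothetical. VIRTUAL_ENV and CONDA_PREFIX are the variables that virtualenv/venv and Conda set on activation; where this check would sit relative to the other resolution steps is not specified in this proposal, so the sketch leaves that to the caller.

```python
import os
from pathlib import Path


def legacy_env_guild_home(env=None):
    """Return $ENV/.guild for an activated environment, or None.

    Only an existing .guild directory counts, so newly created
    environments are ignored and fall through to the normal lookup.
    """
    env = os.environ if env is None else env
    for var in ("VIRTUAL_ENV", "CONDA_PREFIX"):
        prefix = env.get(var)
        if not prefix:
            continue
        candidate = Path(prefix) / ".guild"
        if candidate.is_dir():
            return candidate
    return None
```

A caller would treat a None result as “no legacy environment home” and continue with the resolution steps described earlier.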

Alternative Approaches

Maintain current behavior (no change)

In this case, R developers cannot have project level isolation unless they explicitly modify their environment before running commands.

While activating an environment is a common step for Python developers, it is a novel practice for R users.

This is not a viable option if Guild wants to meet R user expectations of project-isolated runs.

Centralized runs with smart filtering

This approach would introduce the following changes:

  • Default Guild home to ~/.guild
  • Rely on GUILD_HOME and the -H command option for explicit overrides
  • Ignore activated environments (with backward compatibility support for existing environments containing Guild runs)
  • Modify run-related commands to filter runs based on the current directory

By default, run-related commands would show runs that originated from a location at or under the current directory.

Consider the following runs:

  • Run a from directory /foo
  • Run b from directory /bar

If guild runs is run from directory /foo or any of its subdirectories, the list contains run a.

If guild runs is run from directory /bar or any of its subdirectories, the list contains run b.

If guild runs is run from the root directory, the list contains both run a and b.

Run-related commands would support an option to disable this filtering (e.g. -G, --global).
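The filtering rule could be sketched as below. The names visible_runs and show_all are hypothetical, and runs are modeled as simple (name, origin directory) pairs rather than Guild’s actual run objects. To satisfy all three examples above, including the case where the command is run from a subdirectory of /foo, the sketch assumes a run is shown when its origin is at or under the current directory or when the current directory is under the run’s origin. That reading is an interpretation, not something the proposal states outright.

```python
from pathlib import PurePosixPath


def _at_or_under(path, base):
    """True if path equals base or is located somewhere beneath it."""
    return path == base or base in path.parents


def visible_runs(runs, cwd, show_all=False):
    """Filter (name, origin_dir) pairs by the current directory.

    show_all stands in for the proposed -G/--global option.
    """
    if show_all:
        return list(runs)
    cwd = PurePosixPath(cwd)
    selected = []
    for name, origin in runs:
        origin_path = PurePosixPath(origin)
        # Assumed rule: related if either path contains the other.
        if _at_or_under(origin_path, cwd) or _at_or_under(cwd, origin_path):
            selected.append((name, origin))
    return selected


runs = [("a", "/foo"), ("b", "/bar")]
print(visible_runs(runs, "/foo/src"))        # only run a
print(visible_runs(runs, "/"))               # both runs
print(visible_runs(runs, "/foo", show_all=True))  # both runs
```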

Advantages

With this approach, Guild runs are consolidated in a single location by default. If a user wants to change that location, she must explicitly set GUILD_HOME. Using a different Guild home location would be considered an advanced case, and the added step of setting that value (e.g. through explicit activation or configuration) might be considered a feature, since explicit is preferred over implicit.

Runs are no longer spread across multiple projects.

Runs can be compared across projects without multi-environment support in Guild or the need to replicate runs across environments.

Runs are not deleted when projects are deleted. This may be considered a disadvantage, however.

Disadvantages

This approach is less project-centric, which might be a conceptual problem for users.

It’s possible to orphan runs when a project is deleted. This may be considered an advantage, however.
