Summary
This proposal seeks to formalize Guild’s project configuration (a Guild file) as a data structure.
This proposal is awaiting feedback.
Problem
Guild supports a flexible project configuration structure. However, the structure is not formally defined. It is implemented as a set of transformation behaviors defined in the Python module guild.guildfile and in supporting modules, including an externally configurable plugin system.
The lack of a formal structure presents the following problems:
- It’s impractical to validate the correctness of project configuration
- There is no point where a final, canonical data structure exists (to be viewed, saved, etc.)
- It’s impractical to save complete project configuration in a run (Guild currently saves subsets of the configuration to run attributes for specific uses)
- It’s impractical to provide IDE and other tool support (e.g. code completion) for editing project configuration
- It’s impractical to apply generalized algorithms to merge or otherwise combine multiple configurations (a problem presented by the original R plugin, which does not support Guild file configuration)
Proposed Approach
Guild will provide an external schema for project configuration. The schema should be defined outside of imperative Python code (e.g. in JSON). It will be versioned and will correspond to Guild’s feature set at the time of publishing.
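As a rough illustration of what a versioned, external schema artifact could look like, the sketch below assumes a JSON Schema style document. The schema system has not been chosen, and the version number, file name, and property names (operations, main, flags) are hypothetical placeholders, not Guild’s actual schema.

```python
import json
import os

# Hypothetical schema version, for illustration only.
SCHEMA_VERSION = "0.1.0"

# Assumed, heavily simplified structure: a mapping of operation names to
# objects that require a "main" string and may carry a "flags" object.
schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$comment": f"Illustrative Guild file schema, version {SCHEMA_VERSION}",
    "type": "object",
    "properties": {
        "operations": {
            "type": "object",
            "additionalProperties": {
                "type": "object",
                "properties": {
                    "main": {"type": "string"},
                    "flags": {"type": "object"},
                },
                "required": ["main"],
            },
        },
    },
}


def publish_schema(directory: str) -> str:
    """Write the schema to disk as a versioned JSON file and return its path."""
    path = os.path.join(directory, f"guildfile-schema-{SCHEMA_VERSION}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2, sort_keys=True)
    return path
```

Because the artifact is plain JSON on disk, it could in principle be consumed by pre-existing validators and editors rather than by Guild-specific code.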
Guild will provide a command to check/validate project configuration.
Guild will formalize the process of generating a fully resolved project configuration data structure. The process will include the following stages: reading configuration from disk, coercing simplified configuration to its canonical form, and applying plugin transformations. Guild will support visibility into, and validation of, each stage. This will be used for internal testing, external tool validation, and user validation of project configuration.
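The staged process described above can be sketched as a small pipeline that keeps a snapshot after each stage, so that each intermediate result can be inspected or validated. This is a minimal sketch under stated assumptions, not Guild’s implementation: the function names, the string-to-main shorthand, and the add_default_flags plugin are all hypothetical.

```python
from copy import deepcopy
from typing import Any, Callable, Dict, List, Tuple

Config = Dict[str, Any]
Plugin = Callable[[Config], Config]


def read_stage(raw: Config) -> Config:
    """Stand-in for reading from disk; the data is assumed to be parsed already."""
    return deepcopy(raw)


def coerce_stage(config: Config) -> Config:
    """Expand an assumed shorthand: a bare string operation becomes {"main": <string>}."""
    ops = config.get("operations", {})
    coerced = {
        name: {"main": op} if isinstance(op, str) else dict(op)
        for name, op in ops.items()
    }
    return {**config, "operations": coerced}


def add_default_flags(config: Config) -> Config:
    """Hypothetical plugin: ensure every operation has a "flags" mapping."""
    out = deepcopy(config)
    for op in out["operations"].values():
        op.setdefault("flags", {})
    return out


def resolve(raw: Config, plugins: List[Plugin]) -> Tuple[Config, List[Tuple[str, Config]]]:
    """Run every stage and return the final config plus a snapshot per stage."""
    snapshots: List[Tuple[str, Config]] = []
    config = read_stage(raw)
    snapshots.append(("read", deepcopy(config)))
    config = coerce_stage(config)
    snapshots.append(("coerce", deepcopy(config)))
    for plugin in plugins:
        config = plugin(config)
        snapshots.append((f"plugin:{plugin.__name__}", deepcopy(config)))
    return config, snapshots
```

Calling resolve({"operations": {"train": "train"}}, [add_default_flags]) returns the fully resolved structure along with three named snapshots (read, coerce, plugin:add_default_flags), each of which could be checked against the schema independently.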
To Do
- Identify a schema system to use
Consider how widely the system is supported outside Python and how useful it is to other tools. It should be possible to generate a disk-based artifact and validate it using pre-existing tools (not homegrown ones). The system must support tracking original configuration files and line numbers per configuration value.
If such a system doesn’t exist (it might not!) we could consider writing our own, though this is something to avoid if possible.
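To make the source tracking requirement concrete, the sketch below attaches a file name and line number to each configuration value. It assumes a deliberately trivial "key: value" text format rather than a real Guild file, and the Located type and parse_flat function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Located:
    """A configuration value together with where it was defined."""
    value: str
    source: str
    line: int


def parse_flat(text: str, source: str) -> Dict[str, Located]:
    """Parse 'key: value' lines, recording the origin of each value."""
    values: Dict[str, Located] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"{source}:{lineno}: expected 'key: value'")
        values[key.strip()] = Located(value.strip(), source, lineno)
    return values
```

With location data carried alongside each value, a validation error in any later stage could be reported against the original file and line rather than against the resolved structure.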
Alternative Approaches
Do nothing
The “do nothing” option will slow the development of the following features:
- IDE integration (code completion)
- Extending R’s script-based configuration with Guild file-based configuration
- Improved development and debugging support for plugins
Python-based typing
Guild could formalize project configuration using Python data structures and Python’s type system. However, as Guild moves toward becoming a properly designed, language-neutral tool, it ought to rely less on Python-specific conventions and leverage cross-language facilities where possible.