This proposal seeks to expand Guild’s support of scalar log formats beyond TF events.
This proposal is under development.
While Guild supports output scalars, which let users print scalar values to standard output, it otherwise requires users to log scalar values to TF event files. The TF event format is purpose-built for logging scalars (floats associated with a tag/key and a step value) and is compatible with TensorBoard.
This scheme restricts users, however. Users may already log scalars in other formats, such as CSV, JSON, Hydra, and HDF5. They may also use specialized libraries that log to databases (e.g. common experiment tracking loggers).
Guild should be adaptable to these formats by way of plugins.
Guild will extend its plugin API to support reading of scalars from a run directory. Guild will use this facility to scan a run directory for scalars, caching values per source as it does today with TF event files.
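As a rough illustration of what such a reader could look like, the sketch below scans a run directory for CSV files and caches parsed values per source file. Everything here is hypothetical: the `CSVScalarReader` class, its method names, and the assumed CSV layout (a `step` column, with every other column treated as a tag) are not part of Guild's plugin API, and the modification time and size check is just one possible cache invalidation rule.

```python
import csv
import os
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Scalar(NamedTuple):
    """A float associated with a tag and a step value."""
    tag: str
    step: int
    value: float


class CSVScalarReader:
    """Hypothetical reader plugin (not Guild's actual API).

    Assumes each CSV file has a `step` column and that every other
    column is a scalar tag.
    """

    def __init__(self) -> None:
        # Cache per source path: (mtime, size, parsed scalars).
        self._cache: Dict[str, Tuple[float, int, List[Scalar]]] = {}

    def find_sources(self, run_dir: str) -> Iterator[str]:
        """Yield paths of CSV files found under the run directory."""
        for root, _dirs, files in os.walk(run_dir):
            for name in sorted(files):
                if name.endswith(".csv"):
                    yield os.path.join(root, name)

    def read_source(self, path: str) -> List[Scalar]:
        """Return scalars for one source, re-parsing only if the file changed."""
        st = os.stat(path)
        cached = self._cache.get(path)
        if cached and cached[:2] == (st.st_mtime, st.st_size):
            return cached[2]
        scalars: List[Scalar] = []
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                step = int(row["step"])
                for tag, raw in row.items():
                    # Skip the step column, overflow fields, and empty cells.
                    if tag is None or tag == "step" or raw in (None, ""):
                        continue
                    scalars.append(Scalar(tag, step, float(raw)))
        self._cache[path] = (st.st_mtime, st.st_size, scalars)
        return scalars

    def read_scalars(self, run_dir: str) -> List[Scalar]:
        """Collect scalars from every source in the run directory."""
        out: List[Scalar] = []
        for path in self.find_sources(run_dir):
            out.extend(self.read_source(path))
        return out
```

Because the reader works only from files already in the run directory, it can be run at any time after (or during) a run without touching the user's training code.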
Guild has gotten by with the TF event format and output scalars for quite some time, so the “do nothing” option is not unreasonable. However, it leaves users with two workarounds: using output scalars or duplicating their scalar logging. Duplication means either patching an existing logging scheme or converting logs in a post-run step, and a post-run step never runs if the process dies first, so every scalar from that run is lost.
Patch logging schemes at runtime
Guild could add support for alternative logging formats by patching the logging facility at runtime, mirroring the values in TF event files.
There is little upside in this approach and considerable downside:
- Brittleness of API patching
- Limited to runtime environments that support patching
- Subject to loss on process death
- Impact on runtime performance
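To make the brittleness concrete, here is a minimal sketch of runtime patching. `MetricsLogger` is a made-up stand-in for a third-party logger, and the `mirror` callback stands in for a TF event writer; neither reflects a real library or anything Guild does today.

```python
import functools


class MetricsLogger:
    """Stand-in for a third-party logger (hypothetical, not a real library)."""

    def __init__(self):
        self.rows = []

    def log(self, tag, value, step):
        self.rows.append((tag, value, step))


def patch_logger(cls, mirror):
    """Wrap cls.log so each call is also passed to mirror(tag, value, step).

    Returns the original method so the patch can be undone.
    """
    original = cls.log

    @functools.wraps(original)
    def wrapper(self, tag, value, step):
        result = original(self, tag, value, step)
        mirror(tag, value, step)  # extra work on every logging call
        return result

    cls.log = wrapper
    return original


# Mirror every logged value into a list (a TF event writer in practice).
mirrored = []
patch_logger(MetricsLogger, lambda tag, value, step: mirrored.append((tag, step, value)))

logger = MetricsLogger()
logger.log("loss", 0.5, 1)
```

The wrapper hard-codes the signature of `log`. If the library renames the method or changes its arguments, the patch fails or silently stops mirroring, and the mirror call adds overhead to every logged value.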
This approach runs counter to Guild’s philosophy that the run directory is the system of record. Guild has established a successful pattern of reading run artifacts lazily, as they are, and caching them as needed (e.g. lazy creation of log directories for images and hyperparameters in TensorBoard).