Pipelines

guildai · June 12, 2020, 10:26pm

Overview

The term pipeline in this section refers to running multiple operations to accomplish a goal.

Guild pipelines are implemented using higher-order operations, which define sub-operations using a steps attribute.

Consider the following three operations:

prepare-data:
  description: Prepare data for train and test

train:
  description: Train a model
  requires:
    - operation: prepare-data

test:
  description: Test a model
  requires:
    - operation: prepare-data
    - operation: train

The pipeline implied by these operations is:

Run prepare-data to generate a data set that can be used for train and test.
Run train to train a model on prepared data.
Run test to test a model on prepared data.

You can run these operations manually as part of your development process. For example, you might run prepare-data several times, each with different flags, to experiment with different processing stages, feature engineering, etc. After each prepare-data run you may run train to explore model validation performance. After iterating over data preparation and training, you finally run test to measure your results against a hold-out data set.

This process represents an ad-hoc pipeline.

When you want to automate a sequence of operations, create a higher-order operation using steps.

Consider a new operation pipeline:

pipeline:
  description: Runs model pipeline end-to-end
  steps:
    - prepare-data
    - train
    - test

When you run pipeline, Guild starts a higher-order operation, which runs each of the specified steps in the order specified.
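Because steps run in the order listed, the order has to respect each operation's requires entries. The following is a minimal sketch of that check in Python. It is not part of Guild: order_violations is a hypothetical helper, and it assumes no earlier runs exist that could satisfy a dependency on their own.

```python
# Hypothetical helper, not Guild's implementation: report every step that
# is listed before an operation it requires.
def order_violations(requires, steps):
    seen = set()
    problems = []
    for step in steps:
        for dep in requires.get(step, []):
            if dep not in seen:
                problems.append((step, dep))
        seen.add(step)
    return problems


# Dependencies taken from the requires entries of the three operations.
requires = {
    "prepare-data": [],
    "train": ["prepare-data"],
    "test": ["prepare-data", "train"],
}

print(order_violations(requires, ["prepare-data", "train", "test"]))
# []
print(order_violations(requires, ["train", "prepare-data", "test"]))
# [('train', 'prepare-data')]
```

The order used by the pipeline operation (prepare-data, train, test) produces no violations, which is why it can run end-to-end from a clean state.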

Step Flag Values

Specify flag values for a step in one of two ways:

As arguments following the step operation name
In the flags attribute of a step defined with run

For example, this step uses flag arguments:

pipeline:
  steps:
    - train lr=0.01 epochs=100
    - test

This example uses a flags step attribute:

pipeline:
  steps:
    - run: train
      flags:
        lr: 0.01
        epochs: 100
    - test
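The two forms above describe the same step. As an illustration only, the Python sketch below converts the string form into the mapping form. step_to_mapping and parse_value are hypothetical names, and the value parsing (via ast.literal_eval) is an assumption that may differ from how Guild decodes flag values.

```python
import ast
import shlex


def parse_value(text):
    """Best-effort conversion of a flag value; falls back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def step_to_mapping(step):
    """Convert 'op name=value ...' into a {'run': ..., 'flags': {...}} dict."""
    op, *args = shlex.split(step)
    flags = {}
    for arg in args:
        name, _, value = arg.partition("=")
        flags[name] = parse_value(value)
    mapping = {"run": op}
    if flags:
        mapping["flags"] = flags
    return mapping


print(step_to_mapping("train lr=0.01 epochs=100"))
# {'run': 'train', 'flags': {'lr': 0.01, 'epochs': 100}}
print(step_to_mapping("test"))
# {'run': 'test'}
```

The string form is shorter. The mapping form is the one to use when a step needs attributes beyond flags.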

Expose Step Flag Values

To expose a step flag value to the user, define a flag for the pipeline operation and explicitly use it as a step flag.

pipeline:
  flags:
    train-lr: 0.01
    train-epochs: 100
  steps:
    - train lr=${train-lr} epochs=${train-epochs}
    - test
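The ${train-lr} and ${train-epochs} references are replaced with the pipeline's flag values before the step runs. A rough Python model of that substitution follows. resolve_refs is a hypothetical function written for this example and is not Guild's resolver.

```python
import re


def resolve_refs(step, flags):
    """Replace each ${name} in a step string with the matching flag value."""
    def lookup(match):
        name = match.group(1)
        if name not in flags:
            raise KeyError(f"undefined flag reference: {name}")
        return str(flags[name])

    return re.sub(r"\$\{([^}]+)\}", lookup, step)


# Default values from the pipeline's flags attribute.
defaults = {"train-lr": 0.01, "train-epochs": 100}
step = "train lr=${train-lr} epochs=${train-epochs}"

print(resolve_refs(step, defaults))
# train lr=0.01 epochs=100

# A user-supplied value replaces the default before substitution.
overrides = {**defaults, "train-lr": 0.001}
print(resolve_refs(step, overrides))
# train lr=0.001 epochs=100
```

The second call models a user running guild run pipeline train-lr=0.001: the pipeline flag changes, and the train step receives the new value as lr.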

Step Run Attributes

You may define a number of step run attributes when a step is written as a mapping (using run) rather than as a string. Refer to Guild File Reference for details.
