Flags sharing through operations

I have the following Guild structure: https://github.com/MislavSag/trademl/blob/master/guild.yml

I would like to be able to run 3 operations:

  1. Run only data-prepare with saved flags. This part works as expected.
  2. Run only the model part without the data-prepare step. This works as expected if I remove the requires: prepare-data part from the model operation. I am not sure whether it is possible to somehow save the flag values from the data-prepare part even if I didn’t use it in the pipeline with the model operation. For example, I do data prep and then I want to test 5 different models with this data prep, but I want the data-prep flags to be saved too. Probably not?
  3. Run the data-prep and model operations together. This is done with a pipeline. But I don’t understand how I can use flags from both the data-prep and model operations here. If I just use a flag from data prep, for example, it isn’t recognized: guild run pipeline ts_look_forward_window=240

@garrett, sorry for spamming here every day :)

Could you clarify what you mean by “save flags”? Typically a required operation (I’ll call this “upstream” because it occurs before the requiring run, which I call “downstream”) is used for the files it generates. E.g. a data prep operation generates some files that are loaded and used by a training operation.

Do your model operations use the flag values of data prep? Are there any files that the model operations use? What does data prep do in this case?

Guild doesn’t support an optional dependency — one that is satisfied when available but otherwise is ignored. If you define a dependency on an upstream operation, you need to have a run available to satisfy that dependency.

Refer to Step Flag Values for how to pass flags through to step runs. From the look of your Guild file, it’s going to be painful to expose all of those flag values that way. Unfortunately there’s no other way to pass flags through to steps. This is a known issue with Guild that will be fixed in an upcoming release (the ability to pass flag values through without exposing them).

Pipelines are used more for push-button automation with a limited number of flags exposed than for ad-hoc work, which is why the “expose flags” pattern is generally not onerous. That said, it’s not a particularly popular restriction :)

Keep the questions coming! I’m always happy to help.

Save the flags of the data-prep process. For example, data-prep has flag_1. I run the data-prep operation with flag_1=1. Then I run only the model operation with flag_model_1=2. But I would still want flag_1 from the data-prep step to be saved. The reason I want to call guild on those 2 operations separately is that I want to call data-prep to generate X_train, X_test, etc. and then try several models on that data-prep configuration.

No, the model has its own step, but I would just like to know, when I compare runs, what flags were used in data-prep.

Yes, it uses X_train.pkl, X_test.pkl, etc., which were generated in data-prep.

It imports raw data, does preprocessing and train-test splits (in a nutshell).

Ok, I will just remove requires.

I saw those docs, but that means I should again copy-paste all flags from both steps (data-prep and model)?

If you drop the requires for the model ops, you won’t be able to get access to any of the prepared data files. I think you need to have that.

If there’s a case where your model doesn’t need data files, that might be a separate operation. For example, let’s say your model can load and check a network graph for correctness — it doesn’t need any data for that. In that case, I’d create a separate operation, e.g. load-and-validate-graph.

If you have a case where your model op runs fake/sample data, create an upstream operation that provides the prepared fake/sample data. You can do that this way:

prepare-data: ...
prepare-sample-data: ...

    - operation: prepare-data|prepare-sample-data

Regarding steps, yes you’d have to do the extremely annoying work of copying all of those flags and maintaining those. I realize that’s a pain and Guild will fix this in an upcoming release. Until then, maybe consider a subset of flags that you likely change for the pipeline op.

Regarding flag sharing, I updated the example project to, I think, do what you want to do.

Take a look at TESTS.md. This shows the behavior that I think you’re after.

You can run the tests:

guild check -nt TESTS.md

The tests use command substitution, which might not work if you’re running a non-POSIX shell (e.g. Windows command prompt).

There are two main questions answered by this example:

  1. How do you share data across operations?
  2. How do you show a value-of-interest of an upstream run in the corresponding downstream run?

Treat these topics separately.

There’s only one correct answer to question 1. You share data across operations through files. That’s it. An upstream operation saves files that the downstream operation reads. You connect these files in Guild using dependencies. If you weren’t using Guild, you would connect these files by running scripts that use common directories or otherwise point to shared files.

It doesn’t matter what encoding scheme you use as long as upstream and downstream operations agree. In the example project, I picked JSON. The prepare-data operation saves data set metadata in meta.json. The model ops load that metadata. You can use any scheme you want — Guild doesn’t care.
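As a rough illustration of the file-based hand-off described above, here is a minimal Python sketch. The function names (prepare_data, train_model), the flag names, and the data.json file are hypothetical; only the meta.json file name comes from the example project. The sketch does not call Guild at all. With Guild, the downstream operation would locate the upstream files through a dependency rather than through an explicit path argument.

```python
import json
import tempfile
from pathlib import Path


def prepare_data(out_dir, data_type="cats", test_split=0.2):
    """Upstream step: write prepared data plus metadata describing it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Stand-in for the real prepared data set (hypothetical content).
    (out_dir / "data.json").write_text(json.dumps([1, 2, 3, 4, 5]))
    # Record the values of interest so downstream steps can read them.
    meta = {"data_type": data_type, "test_split": test_split}
    (out_dir / "meta.json").write_text(json.dumps(meta))
    return meta


def train_model(data_dir, max_depth=3):
    """Downstream step: read the files written by prepare_data."""
    data_dir = Path(data_dir)
    data = json.loads((data_dir / "data.json").read_text())
    meta = json.loads((data_dir / "meta.json").read_text())
    # Return the upstream metadata next to the model's own settings so
    # both can be inspected together when comparing results.
    return {"max_depth": max_depth, "n_samples": len(data), "data_meta": meta}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prepare_data(tmp, data_type="dogs", test_split=0.3)
        # One prepared data set, several model configurations.
        for depth in (2, 4):
            print(train_model(tmp, max_depth=depth))
```

The point of the design is that the two steps agree only on file names and encoding, so the upstream step can run once and any number of downstream steps can read its output.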

The remaining question is number 2. When comparing runs, how do you show important values from an upstream run in a downstream run? Guild doesn’t have a good answer for this. It will in an upcoming release.

In the meantime, I hacked flags to show how you can do this today.

Please note that this is a hack — meaning we’re using flags for a purpose they are not intended for. It’s a safe hack, which is why I’m presenting it as an option here.

Technically speaking, the metadata associated with the prepared data are not flags — they are especially not model operation flags. I would describe these as attributes of the upstream run. They may be derived from upstream flag values, but they could also be generated. Whatever they are they’re part of the generated upstream artifacts. The problem is that we want to see what those are whenever we look at one of the downstream runs (model ops) that use these upstream artifacts.

That’s going to take a bit of engineering to get right. But that’s another topic. This is related to this discussion.

In any case, the flag hack is relatively simple and I think gets you your flag sharing scenario while maintaining a correct data interface across runs.

Let me know what you think!

I am confused now. Say I have a data-prep operation with its flags and a model operation with its own flags (as in the Guild file I posted). How can I run those 2 operations sequentially in a pipeline while doing, say, a grid search over both sets of flags? For example, something like this (this doesn’t work):
guild run pipeline flag_from_data-prep=range[1:5:1] flag_from_model=range[1:5:1]
I even tried to add the flags manually to the pipeline but I still can’t run the example above.
I think I am spending too much time on this. Is there a simpler solution?

I created another pipeline example that you can use to run end-to-end pipelines with search.

You’d run it this way:

guild run pipeline-2 data-type=[cats,dogs] random-forest-depth=range[1:5]

This is a true end-to-end pipeline. You run it for each set of flag values that you’re interested in testing. (Note that you’ll end up with a number of pipeline-2 runs. You can delete those using guild runs rm -o pipeline-2 afterward.)
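To make "each set of flag values" concrete, here is a small Python sketch of how a grid over two flags multiplies out into individual trials. It is an illustration only and does not use Guild. The flag names are taken from the command above, but the depth values are written out by hand as an assumption rather than derived from Guild's range function, whose end-point handling is best checked in the Guild docs.

```python
from itertools import product

# Hypothetical search space mirroring the flags in the command above.
# The depth list is an assumption, not the output of Guild's range[...].
search_space = {
    "data-type": ["cats", "dogs"],
    "random-forest-depth": [1, 2, 3, 4, 5],
}


def expand_grid(space):
    """Return one dict of flag values per combination in the grid."""
    names = list(space)
    combos = product(*(space[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    for trial in expand_grid(search_space):
        print(trial)
```

With two data types and five depths the grid has ten combinations, which is why a search over many flags quickly produces a large number of pipeline runs.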

Another approach, which might be the “just keep it simple” scenario you’re looking for, is to implement all of this with a single operation. That would remove both the need to “share flags” and the need for a pipeline.

If you have one operation that does both data prep and model work, you’ll need to be smart about the data prep. That operation will end up being called with the same data prep flags multiple times. I would handle this by maintaining a record of inputs to generated files and then symlinking to those files if they already exist. While that’s painful, it might be less painful than using separate operations.
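One way the "record of inputs plus symlink" idea could look is sketched below in Python. Everything here is an assumption made for illustration: the function names, the cache layout, the use of a hash of the flag values as the record, and the X_train.json stand-in file. It is not how Guild itself resolves files, and it relies on os.symlink, which may need extra privileges on Windows.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def prep_key(flags):
    """Stable short hash of the data prep flag values."""
    encoded = json.dumps(flags, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:16]


def prepared_data(flags, cache_root, run_dir):
    """Link run_dir/data to cached prep output, generating it on a miss.

    Returns True if data prep actually ran, False on a cache hit.
    """
    cache_dir = Path(cache_root) / prep_key(flags)
    generated = not cache_dir.exists()
    if generated:
        cache_dir.mkdir(parents=True)
        # Stand-in for the real, expensive data prep work (hypothetical).
        (cache_dir / "X_train.json").write_text(json.dumps(flags))
    link = Path(run_dir) / "data"
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.is_symlink():
        os.symlink(cache_dir.resolve(), link, target_is_directory=True)
    return generated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "cache"
        flags = {"ts_look_forward_window": 240}
        print(prepared_data(flags, cache, Path(tmp) / "run-1"))
        print(prepared_data(flags, cache, Path(tmp) / "run-2"))
```

Keying the cache on the sorted flag values means two runs with identical data prep settings share one copy of the prepared files, while any change to a flag produces a fresh directory.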

The reason I wanted separate operations is that I use the same data prep for multiple models, so I have to change the data-prep files only once.

I will try to implement your last pipeline-2 approach. If I have problems, I will move to the simplest possible setup: prep and model in one script.

Now I am not even sure how to link operations from the same model.

I have to admit I am discouraged by all this. I am going through the documentation and examples all over again and can’t do simple things anymore. I can only get the simplest thing working: running one script with flags. If I need something as simple as doing data preparation in one file and training in another with grid/random search, I find it hard to find such examples. There are lots of great features in the package but I am struggling with ‘simple’ things. Your last solution above for connecting 2 operations seems too complicated when there are lots of flags which I have to define several times (and change later). Again, I probably don’t understand the structure of the package. Your support is the best I have encountered so far, and I have dealt with a lot of projects. Maybe I will try again in the near future.

Again, I think the problem is with me, not the package. The only thing I can do is put everything in one file and run it.

This can definitely be a frustrating process. Most of the examples for Guild are quite simple so it’s hard to apply them to more advanced cases.

In looking over https://github.com/MislavSag/trademl it’s hard for me to understand what you want to run.

Is it possible to run any of these operations with sample data? If you can provide steps in a README that you use to perform the steps, I can get that working in a Guild file for you. Ideally there’s some data to prove the steps. If not I can work around that.

I will try one more time. I have changed guild file again:

This time I don’t want to use config and pipelines since they can’t map flags from particular operations and it is inconvenient, IMHO, to copy paste and manually change flags every time.

This time I will try to construct only random forest and LSTM models with two operations: prepare and train. Prepare is the same for those 2 models but I don’t see a solution for this. As I said before, I tried to make only one prepare file but it led to complex structures, since config and pipeline don’t work as I expected from reading the docs.

Now, the only thing I want to do is run the prepare operation and then the train operation. The results of the prepare step are saved to a specific folder and train then reads files from that folder. For example, how can I run something like:

guild models:prepare:train-random-forest structural_break_regime=all  max_depth=range[1:5:1]
guild models:prepare:train-lstm structural_break_regime=all  epochs=[20,25]

I skimmed all the examples and couldn’t find a run like the one above. I also checked the docs several times and couldn’t find the solution.

If it is not possible, I will just put every model in one file (without multiple steps) and run it like that.

It’s not clear to me what isn’t working for you.

Your Guild file does not define dependencies, so that’s a problem.

Did you get a chance to look at the example I prepared for you? This is a working example that shows concretely how to accomplish what I think you’re looking for.

Regarding pipelines, you should defer that until you have this basic workflow in place:

  1. Run one prepare op
  2. Run one model training op (e.g. random forest)
  3. Run another model training op (e.g. decision tree)

Run these steps manually. Don’t automate anything until the steps are working by themselves as you expect.

Runs 2 and 3 should use the output from run 1.

Until you can see this process working you’ll be spending a lot of time thinking about non-essential problems.

I suggest that you start with the example that I provide. Run it. Don’t just look it over. Use that as the basis for your work.

You have a lot of flags defined from your operations in the Guild file. You can skip flags for now. I feel this is noise until you can run your operations. See steps 1 - 3 above. You don’t need to get everything working in one pass. Start with something extremely simple (e.g. replicate the steps of the example that I provide) and then make small changes along the way.

When you run into an issue with a small change, it’s easier to solve than “big picture” approaches.

I don’t have a problem with running 1 operation. For example, guild run prepare-data works and the output is saved files (X_train, X_test…). The command guild run random-forest:train works too. It uses the saved files from step one and trains the model. The only drawback is that it doesn’t save the flags from step one, but that is expected.

Now, the problem is if I want to run steps 1 and 2 together with several flags. Pipeline-2 in your example works, but what if I have 20 flags and want to change the domain of every flag from run to run? Does that mean I have to change the pipeline manually every time and can’t define flags from the command line?

I don’t see how your model operation gets access to the prepared data. The only way this could work is if you’re saving the data set to a shared/known location.

Regarding the pipeline, let’s just set that aside for now.

Can you run these steps?

  1. Run prepare data - this generates the data sets.
  2. Run a single model op using grid search across whatever flag search space you want - each run uses the data set from step 1.

Is that possible? Or do you need to re-run the prepare data each time?

Regarding “saved flags” I gave you a working example of how to do that. Did you run that and see how it works?

@garrett, yes, I am ‘saving the data set to a shared/known location’. I thought this was the only possibility. How else could the model use X_train and the other files from data-prep? That is the reason I commented out the requires attribute.

I can run steps 1 and 2 that you recommended trying.

I can run pipeline-1 and pipeline 2 too. They are just different than in your example (pipeline-rf and pipeline-rf-opt):

I think it would help to get on the same exact page - with runnable code. The sample project I created is an attempt to share code that I know works. We could take the other approach - you share code with me and I modify it to work as it should/you want it to.

Let me make a suggestion. If you can create a small, separate project that is as simple as possible - and all it does is fake work that is representative of the real work - we can use that to address your issues.

I’m happy to spend the time with you on this. I know you’ve invested a lot of time and I’d like you to benefit from that.

Let’s pick a project on GitHub and use that to make concrete, runnable changes. Once you see things working, I think you’ll feel a lot better and less frustrated.

Actually this code is working now. I didn’t know I could ‘override’ flag values from the pipeline.

The only thing I’m not sure about is whether my approach of saving files to a shared folder is wrong for some reason.

I will create sample data in the same repo so you can reproduce my result if needed.