Let's say I have two operations: preprocessing and training.
Each has a series of flags that I want to run a hyperparameter search over.
Here is a rough idea of what my operations would look like
Neglecting the practicality of this scenario, I want to run my preprocessing operation every time I run train. Is there a way to achieve this with pipelines or a similar method? I know I could do
and this would solve my problem, but when both operations have a lot of parameters it would require a lot of copying and pasting between the pipeline flags and the operation flags. Is there a better, more efficient way to achieve this? I basically want the pipeline to “inherit” the flags of the steps it's performing, if possible. The documentation seems to hint at something like this
But when I create a pipeline with no flags and attempt to pass a flag that is defined by one of the steps, I get an error saying that the flag does not exist. Is what I am trying to do possible with Guild?
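If the flags do end up duplicated between the pipeline and its steps, one way to avoid the manual copy/paste is to generate the pipeline definition from the step flag definitions. Below is a minimal sketch in Python. The flag names (`window`, `normalize`, `lr`, `epochs`) and the `build_pipeline` helper are hypothetical, and it assumes the steps reference pipeline flags as `${name}`. It only builds plain dictionaries and does not call Guild, so the result would still need to be written into the Guild file.

```python
def build_pipeline(step_flags):
    """Build pipeline flags and steps from {operation: {flag: default}}.

    Hypothetical helper: it only assembles dictionaries and does not use Guild.
    """
    pipeline_flags = {}
    steps = []
    for op_name, flags in step_flags.items():
        for flag, default in flags.items():
            # Refuse ambiguous names rather than silently overwriting a default.
            if flag in pipeline_flags:
                raise ValueError(f"flag {flag!r} is defined by more than one step")
            pipeline_flags[flag] = default
        # Each step forwards the pipeline-level value using a ${name} reference
        # (assumed reference syntax).
        steps.append({
            "run": op_name,
            "flags": {flag: "${%s}" % flag for flag in flags},
        })
    return pipeline_flags, steps


# Hypothetical flag definitions for the two operations.
step_flags = {
    "preprocess": {"window": 5, "normalize": True},
    "train": {"lr": 0.01, "epochs": 10},
}

pipeline_flags, steps = build_pipeline(step_flags)
```

The pipeline then exposes every step flag once, and each step receives only its own flags. Raising on duplicate names is a deliberate choice here; prefixing flags with the operation name would be another option.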
I started to think the copy/paste solution wouldn’t actually solve my problem, because I also need an additional run that should only run once, before preprocessing and train. Since it is outside the pipeline, my steps cannot access it without isolate-runs. However, this requires using the run: operation attribute in my step
I found this by accident and couldn’t find anything about it in the documentation. I figured I would leave it here in case someone has a similar issue. A reference to where this behavior is documented would be helpful.
Right now I have around 10 and could potentially use more in the future.
Being able to access the step flags from the pipeline would definitely help a lot.
As another solution, I tried using $include and also ran into issues. I made two configs, one for train and one for preprocessing, and then included both in the pipeline
but it only detected the flags for the last $include attribute, in this case train-flags. Can only one $include be used, or am I possibly doing something wrong?
It is. Sorry, this is indeed taking too long to get in. I'm looking at this for the next major release (0.8), along with DVC integration and generalized run query support (there’s an excellent PR waiting for merge related to tags, but I want to see generalized, flexible query support across all attrs).