I would like to create a prepare-data file that is shared by several models.
In the documentation on pipelines (Pipelines) you show how to construct a pipeline where the prepare-data step belongs to the same model. But I would like to have one prepare step for several models (for example, decision tree and random forest).
I would like to know what the recommended way to do this in Guild AI is.
The first thing that comes to mind is to have a prepare file and source it from the train script, but I am not sure whether the flags from that prepare file would work normally when running the train operation from Guild.
Do you want to run prepare-data once and use that one prepared data set for all models? Or do you want to run separate prepare operations, each one creating a separate data set for use by each different model type?
The example shows a single data set that’s used by both
test for a single model type. But these operations could just as easily be
train-random-forest. The interface would be the same.
I don’t quite follow your last paragraph. From that it sounds like each model has its own prepare logic.
@garrett, I want one prepare step for multiple models (maybe even all models, I will see).
If I understand correctly, you recommend the following structure:
- model: multie_models
Yes, that’s right. Though you could optionally use model objects to better represent what you’re doing there.
Here’s an example:
You can run the pipeline operation to run the operations end-to-end.
If you prefer to use only operations, use this:
- operation: prepare-data
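A minimal sketch of what an operations-only Guild file could look like. The module names (prepare_data, train_decision_tree, train_random_forest) and the training operation names are hypothetical placeholders, not taken from the example repo:

```yaml
# guild.yml in operations-only format (a sketch; module names are hypothetical)
prepare-data:
  description: Prepare one data set shared by all training operations
  main: prepare_data            # hypothetical module prepare_data.py

train-decision-tree:
  main: train_decision_tree     # hypothetical module
  requires:
    - operation: prepare-data   # use files from a prepare-data run

train-random-forest:
  main: train_random_forest     # hypothetical module
  requires:
    - operation: prepare-data   # same prepared data, different model

pipeline:
  description: Run prepare and both training operations end-to-end
  steps:
    - run: prepare-data
    - run: train-decision-tree
    - run: train-random-forest
```

With a layout like this, both training operations depend on the same prepare-data operation, so the data only needs to be prepared once.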
This is the first time I have seen config.
Why do I need the config object in the first place? For example, what if I remove config (and extends in the models) and just define operations and models objects?
In the above YAML you define the prepare-data operation twice, in config and then again in operations. I don’t see the reason for this.
In this case config is used to define shared resources. The prepare-data operation is defined once in both examples.
It’s subtle, but prepared-data (notice the difference in naming convention) is the name of a resource. It is spelled this way so that requires: prepared-data reads better. You can name it whatever you want.
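To illustrate the distinction between the resource name and the operation name, here is a rough sketch of a Guild file that uses config for a shared resource. Apart from prepare-data and prepared-data, every name (shared-resources, data, decision-tree, random-forest, and the main modules) is a hypothetical placeholder, and the operation reference may need adjusting depending on which model defines prepare-data:

```yaml
# A sketch, not the original example. Names other than prepare-data and
# prepared-data are hypothetical.
- config: shared-resources
  resources:
    prepared-data:                 # resource name (what models require)
      - operation: prepare-data    # operation name (what produces the files)

- model: data
  operations:
    prepare-data:                  # the operation itself is defined only once
      main: prepare_data           # hypothetical module

- model: decision-tree
  extends: shared-resources        # inherits the prepared-data resource
  operations:
    train:
      main: train_decision_tree    # hypothetical module
      requires: prepared-data

- model: random-forest
  extends: shared-resources
  operations:
    train:
      main: train_random_forest    # hypothetical module
      requires: prepared-data
```

Here prepared-data appears under resources and in requires, while prepare-data appears as an operation and as the source of the resource, so nothing is defined twice.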
How should I run only the prepare step now?
guild run prepare-data doesn’t work.
How does it not work? You can clone the repo above and run the sample to confirm.
ERROR: error in C:\Users\Mislav\Documents\GitHub\trademl\guild.yml: invalid value for operation ‘None’ ‘prepare’: expected a string or a mapping
I’d need to see the Guild file to help here.
Delete this line:
Also, it’s spelled arg_name. Guild should complain about that but it doesn’t.