Runs vs storing models

What is the best practice for storing models resulting from different runs?

I made a pipeline that, after training, stores the model in an appropriate file in a models directory. In my guild.yml file I added models in the requires section.

That seems to result in runs overwriting files in the models directory.

I’ve checked the models and packages options for the Guild file, but they still don’t answer my question: I don’t see what to use to get different artifacts from different runs.

Excellent question, so much so that I created a detailed example to help answer it.

TL;DR: you need to either use target-type: copy to avoid linking to the upstream directories or, IMO better, explicitly control the inputs and outputs of your operations to avoid accidentally overwriting upstream run files. The example linked above shows this in detail.
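To give a rough idea of what "explicitly control the inputs and outputs" can look like inside a training script, here is a minimal Python sketch. It is not taken from the linked example: the function names `save_model` and `load_model`, the `model.pkl` filename, and the use of pickle are all assumptions made for illustration. The idea is that an operation reads only from its input directory and writes only to a separate output directory, and that the write fails loudly rather than replacing a file that already exists.

```python
import os
import pickle


def save_model(model, output_dir, filename="model.pkl"):
    """Write `model` under `output_dir`, refusing to overwrite an existing file.

    `output_dir` and `filename` are hypothetical names; pass a directory that
    belongs to the current run, not one linked from an upstream run.
    """
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    # Mode "xb" creates the file exclusively and raises FileExistsError
    # if something is already there, instead of silently replacing it.
    with open(path, "xb") as f:
        pickle.dump(model, f)
    return path


def load_model(input_dir, filename="model.pkl"):
    """Read a model from `input_dir` without writing anything there."""
    with open(os.path.join(input_dir, filename), "rb") as f:
        return pickle.load(f)
```

A downstream step would then call `load_model` on the directory it received from the upstream run and `save_model` on a differently named directory of its own, so a second write to the same path raises an error instead of clobbering the earlier artifact.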

This case underscores a flaw in Guild, which is that it doesn’t do anything to prevent this sort of accident. Guild needs to set read-only flags on run-generated files as a minimal measure of protection. This is on the roadmap, but I’ll make sure it is bumped in priority, as this really bothers me, as it should everyone. I’m sorry you ran into this.

I’d answer here, but I think the example shows all the ins and outs in detail and provides working, testable code, so I’ll point you there.
