Is there an effective way to export some of the data generated in a run to a designated location?

So I generate a bunch of results across multiple runs of the same model (I like to keep them in the run subdirectory because it helps me with output data versioning). However, when it comes time to analyse them, I always have to write an extra bit of code to copy files out of the selected runs. I am therefore looking for a more elegant way to do this.

I also tried guild export, but it either copies all resources or none of them, and I still need to manually collect the results from the exported run subdirectories into a single directory. Furthermore, guild export of a pipeline only copies the source code, leaving behind the results generated in its steps.
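
For the record, my export attempts looked roughly like this (the destination is the same placeholder path as in the loop below):

guild export /somewhere/else ID1 ID2

which still leaves me with complete run subdirectories under /somewhere/else to sift through.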

So I used to do something like:

for runid in ID1 ID2 ...
do
  mkdir -p "/somewhere/else/${runid}"
  # `guild open --cmd='echo'` prints the resolved path instead of opening it
  cp -rv "$(guild open "${runid}" --cmd='echo' --path=relative/output/dir)"/* "/somewhere/else/${runid}"
done

I wonder if there is a more elegant way to do this, perhaps something like guild cp RUNID --path=relative/paths LOCATION. Using guild open also feels a little odd in this scenario.
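
In the meantime, a small wrapper makes the workaround a bit more ergonomic. This is only a sketch; guild_cp is a hypothetical name, and it relies on the same guild open --cmd='echo' behaviour as the loop above:

guild_cp() {
  # guild_cp RUNID RELPATH DEST: copy RELPATH from RUNID's run dir into DEST/RUNID
  local runid=$1 relpath=$2 dest=$3
  local src
  # resolve the path inside the run dir without actually opening it
  src=$(guild open "${runid}" --cmd='echo' --path="${relpath}")
  mkdir -p "${dest}/${runid}"
  cp -rv "${src}"/* "${dest}/${runid}"
}

guild_cp ID1 relative/output/dir /somewhere/else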

Yes, open is a nice hack there!

Guild supports select, which could be extended with a --cmd option, similar to open but with access to environment variables like $RUN_DIR, to support something like this:

guild select --cmd 'cp -a $RUN_DIR/relative/output/dir /somewhere/else/$RUN_ID'

(Single quotes here keep the shell from expanding $RUN_DIR and $RUN_ID before Guild sees them.)

This follows the pattern that find uses with -exec. In fact, I could see renaming guild select to guild find.
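
For comparison, here is the shape of find's -exec pattern (the '*.ckpt' filter is just an illustration):

find . -name '*.ckpt' -exec cp {} /somewhere/else \;

{} expands to each matching path and \; terminates the command; a select --cmd would play the same role for runs.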

Currently select returns only the first matching run, but this could be changed (there is already a feature request for that).

This is not quite the same as a cp command, but it is more versatile. See man cp, for example, for some of the options Guild would otherwise potentially have to implement, and across platforms at that.


Thanks! That's THE feature I am looking for. I would definitely like to see it happen.
