Specifying nested resources for different models

anandan9470 · September 10, 2020, 6:48am

I have a repository structured as follows -

|--data/data-for-model-1/data.csv
|--data/data-for-model-2/data.csv
|--src/model-1-train.py
|--src/model-2-train.py

My Guild file is as follows:

- model: model-1
  operations:
    train:
      main: 'src/model-1-train'
      requires:
        - detector-data
  resources:
    detector-data:
      - file: data/data-for-model-1
    target-type:
      - link

- model: model-2
  operations:
    train:
      main: 'src/model-2-train'
      requires:
        - detector-data
  resources:
    detector-data:
      - file: data/data-for-model-2
    target-type:
      - link

Within model-1-train.py, I am accessing its data by reading the path read('data/data-for-model-1/data.csv').

But this fails under Guild: specifying data/data-for-model-1 in the Guild file creates a symlink named data-for-model-1 directly in the new cwd that Guild creates for the run. So from within the Python file, you would need to call read('data-for-model-1/data.csv') when running with Guild.

So it seems that I need to modify the code depending on whether or not I am using guild to run the experiment.

I can get around this by just not having the data directory and putting data-for-model-1 and data-for-model-2 at the same level as src and always do read('data-for-model-1/data.csv') from within the python file.

Or I can simply do

  resources:
    detector-data:
      - file: data

for both the models. That is fine (and probably what I will do) so long as only a symlink is created and the data is not being copied over.

Is this the only way to ensure that the code can remain unchanged irrespective of whether I am running with guild or not?

PS: Amazing project, love it!

garrett · September 10, 2020, 11:23am

Hi @anandan9470, I’m glad you’re enjoying Guild!

You want to set target-path to data, either on the resource or on the individual resource source.

- model: model-1
  resources:
    detector-data:
      - file: data/data-for-model-1
        target-path: data
        target-type: link

Also, you specify target-type as a list there. It should be a string, set on the resource source (see above).

For more info, you can read about target path in the Guild File Reference.
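
To see why target-path makes the original relative path work, here is a minimal Python sketch that mimics the linking behavior described in this thread. It is not Guild's implementation: link_resource and demo are hypothetical helpers, and the assumption (taken from the posts above) is that the link is named after the source's basename and created under target-path when one is set, otherwise at the root of the run directory.

```python
import os
import tempfile


def link_resource(project_dir, run_dir, source, target_path=None):
    """Hypothetical stand-in for resolving a `file` source as a link.

    Assumption: the link takes the basename of `source` and is created
    under `target_path` if given, otherwise directly in the run directory.
    """
    dest_dir = os.path.join(run_dir, target_path) if target_path else run_dir
    os.makedirs(dest_dir, exist_ok=True)
    link = os.path.join(dest_dir, os.path.basename(source))
    os.symlink(os.path.join(project_dir, source), link)
    return os.path.relpath(link, run_dir)


def demo():
    """Check whether 'data/data-for-model-1/data.csv' resolves in each run dir."""
    with tempfile.TemporaryDirectory() as tmp:
        # Build a throwaway project that matches the layout in the question.
        project = os.path.join(tmp, "project")
        src = os.path.join(project, "data", "data-for-model-1")
        os.makedirs(src)
        with open(os.path.join(src, "data.csv"), "w") as f:
            f.write("x,y\n1,2\n")

        results = {}
        for name, target_path in [("default", None), ("target-path", "data")]:
            run_dir = os.path.join(tmp, "run-" + name)
            os.makedirs(run_dir)
            link_resource(project, run_dir, "data/data-for-model-1", target_path)
            # The path the training script uses, relative to the run dir.
            wanted = os.path.join(run_dir, "data", "data-for-model-1", "data.csv")
            results[name] = os.path.exists(wanted)
        return results


if __name__ == "__main__":
    print(demo())  # {'default': False, 'target-path': True}
```

Under these assumptions, only the run directory built with target_path="data" contains data/data-for-model-1/data.csv, so the training script can keep reading the same relative path with or without Guild.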
