I sometimes see guild taking a long time to start a run compared to just running the command that I get from --print-cmd. I realise this is because guild has to resolve dependencies etc., but I would like to understand if there is an easy way to debug, and especially profile, which steps / operations are expensive in the guild command.
I am aware of the guild --debug flag, but in my particular case it doesn’t provide much info about what is taking a long time.
It looks like you have a directory with a lot of files (over 1M). Guild is examining those files to see if they’re candidates for source code copy. By default Guild only looks at I think around 100 files unless you’ve configured the sourcecode attribute for the operation.
You can see what’s going on by running:
guild run <op> --test-sourcecode
This will still take just as long, but you’ll see where the files are.
You can remove a directory from consideration (Guild won’t scan it) this way:
op:
  sourcecode:
    - exclude:
        dir: <dir containing lots of files>
That step is what takes a long time. The interesting thing is that the sourcecode directory in the run directory only contains the source code that I have specified.
$ ls ~/.../.guild/runs/5d239a67d97d4bd4952e2b1cc2b10083/.guild/sourcecode/
guild.yml scripts training
Refer to the example I provided above. You need to explicitly exclude any directories containing large numbers of files, unless you want those scanned for consideration as source code files. The scan is what’s taking the time, and the snippet above will address that.
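If it isn't obvious which directories are the heavy ones, something like the following might help find them. This is only a rough sketch using the Python standard library, not anything Guild provides: the function name count_files_by_top_dir is made up for illustration, and it simply counts files on disk without applying Guild's own source code selection rules.

```python
import os
from collections import Counter


def count_files_by_top_dir(root):
    """Count files under each top-level entry of root (hypothetical helper).

    Files that sit directly in root are counted under ".".
    """
    root = os.path.abspath(root)
    counts = Counter()
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts


if __name__ == "__main__":
    # Print the ten top-level directories holding the most files.
    for name, n in count_files_by_top_dir(".").most_common(10):
        print(f"{n:>10}  {name}")
```

Run it from the project directory (the one containing guild.yml). Any directory near the top of the list that isn't source code is a candidate for an exclude entry like the one above.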