I have been using Guild with Slurm directly, but there is currently a problem when calling guild compare. The NAS is mounted at /home, so the .guild folder is also shared with the child nodes. Guild cannot look up the PID of a process running on a child node, so it displays those runs as terminated, which is slightly inconvenient.
I looked at the remote settings, but I think a remote passes the job to a remote server to run instead of invoking it locally. Does anyone have a good solution for this?
This won’t be easy to solve without making a change to Guild itself. Guild needs to know that a run is managed by another host/node. This is going to be an issue with any shared volume.
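To make the problem concrete, here is a rough sketch of why a PID-based liveness check breaks on a shared volume, and what a host-aware check might look like. This is not Guild's actual code: `pid_alive`, `run_status`, and the idea of recording a hostname next to the PID are all assumptions made for illustration, and the sketch assumes a POSIX system.

```python
import os
import socket


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists on the LOCAL host (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def run_status(lock_host: str, lock_pid: int) -> str:
    """Hypothetical status check that takes the recording host into account.

    lock_host and lock_pid stand in for values a run would have to record
    when it starts. They are assumptions, not fields Guild is known to store.
    """
    if lock_host != socket.gethostname():
        # The PID belongs to another node's process table, so a local
        # lookup says nothing about whether the run is still alive.
        return "unknown (managed by " + lock_host + ")"
    return "running" if pid_alive(lock_pid) else "terminated"
```

Without the hostname comparison, the login node would look up the compute node's PID in its own process table, usually find nothing, and report the run as terminated, which matches the behavior described above.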