I have some code that works when run standalone but fails when run from Guild. The offending line is:
torch.multiprocessing.spawn(main_worker, nprocs=n_gpus, args=(n_gpus, args))
and the complaint is:
[...]
  File "/usr/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 183, in get_preparation_data
    main_mod_name = getattr(main_module.__spec__, "name", None)
AttributeError: 'dict' object has no attribute '__spec__'
Does anyone have any tips? I’m not sure I really understand what’s failing in the spawn call… Thanks!
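
For anyone debugging the same thing: the last frame of the traceback evaluates main_module.__spec__, and the error says main_module is a dict. In the standard library's spawn.py, main_module is looked up from sys.modules['__main__'], so the traceback suggests that entry is a plain dict rather than a module object when the script runs under Guild. Why Guild would leave it that way is an assumption I have not confirmed. Below is a minimal diagnostic sketch that could be called just before the spawn line to check this; describe_main_module is a hypothetical helper name, not part of torch, multiprocessing, or Guild.

```python
import sys
import types


def describe_main_module(main_module=None):
    """Summarize the object that multiprocessing's spawn code treats as __main__.

    Hypothetical helper for diagnosis only. If no object is passed in,
    it inspects sys.modules['__main__'], the entry the traceback points at.
    """
    if main_module is None:
        main_module = sys.modules.get("__main__")
    return {
        "type": type(main_module).__name__,
        "is_module": isinstance(main_module, types.ModuleType),
        "has_spec_attr": hasattr(main_module, "__spec__"),
    }


def spec_lookup_fails(main_module):
    """Repeat the attribute access from the failing traceback line."""
    try:
        getattr(main_module.__spec__, "name", None)
    except AttributeError:
        return True
    return False


# Call this right before torch.multiprocessing.spawn(...) in the real script.
print(describe_main_module())
```

If the printed type is dict and is_module is False when run from Guild, but the type is module when run standalone, that would confirm the failure comes from how __main__ is set up and not from main_worker or its arguments.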