In my project I automatically tune my pytorch-lightning models with ray[tune] and then automatically apply the resulting model. The logger that ray[tune] uses is a SummaryWriter from the tensorboardX package, and I also use a tensorboardX SummaryWriter for other logging in my project. My own logging works fine, but for some reason guild fails on the add_scalar() calls made from inside the tune library.
The trace:
3/8/2021 5:40:52 PM
Traceback (most recent call last):
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 594, in _process_trial
    decision = self._process_trial_result(trial, result)
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 666, in _process_trial_result
    self._callbacks.on_trial_result(
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/ray/tune/callback.py", line 192, in on_trial_result
    callback.on_trial_result(**info)
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/ray/tune/logger.py", line 393, in on_trial_result
    self.log_trial_result(iteration, trial, result)
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/ray/tune/logger.py", line 631, in log_trial_result
    self._trial_writer[trial].add_scalar(
  File "/home/davina/miniconda3/envs/ap/lib/python3.8/site-packages/guild/python_util.py", line 239, in wrapper
    cb(wrapped_bound, *args, **kw)
TypeError: _handle_scalar() got an unexpected keyword argument 'global_step'
The failing line in tune, in full, is self._trial_writer[trial].add_scalar(full_attr, value, global_step=step).
In my own project I have the line logger.add_scalar(f"{prefix}/{tag}", scalar_value, global_step, walltime), and this does not fail.
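A minimal sketch of what seems to be going on (hypothetical names and simplified logic, not guild's actual implementation): a wrapper around add_scalar forwards the caller's arguments verbatim to a callback whose only parameter name for the step is step, so passing the value positionally works while passing it as the keyword global_step does not.

```python
# Hypothetical sketch of the mismatch; names are illustrative only.

def _handle_scalar(writer, tag, value, step=None):
    # The callback only knows the parameter name "step".
    return tag, value, step

def add_scalar_wrapper(writer, *args, **kw):
    # Forwards positional and keyword arguments to the callback unchanged,
    # the way a generic method wrapper would.
    return _handle_scalar(writer, *args, **kw)

# Passing the step positionally binds to "step" -- fine:
add_scalar_wrapper(None, "loss", 0.5, 3)

# Passing it as global_step= has no matching parameter -- TypeError:
try:
    add_scalar_wrapper(None, "loss", 0.5, global_step=3)
except TypeError as e:
    print(e)
```

This is why my own positional call succeeds while tune's keyword call fails, even though both are valid ways to call tensorboardX's add_scalar.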
So I went into the ray.tune library, changed the call to self._trial_writer[trial].add_scalar(full_attr, value, step), and reran it. The failure went away.
I dug into the GitHub source, and it looks like _handle_scalar() is expecting step and not global_step.
I originally needed help with this, but as I wrote it up I ended up figuring out the answer myself. Looks like there's a potential bug here?