Cannot open Tensorboard with Guild - unhashable type 'Dict'

Hi,
First of all, great job, loving Guild so far.

I have successfully created a queue, staged runs and completed them. I can run ‘guild view’ or ‘tensorboard’ standalone and see the logs.

However, I cannot run ‘guild tensorboard’ or similarly start TensorBoard from guild view. In both cases I get an error with the following traceback:

(anomaly_detection) E:\source\repos\anomaly_simulation\Zoo>guild -H E:/source/repos/anomaly_simulation/Results tensorboard
Preparing runs for TensorBoard
Traceback (most recent call last):
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\blepo\Anaconda3\envs\anomaly_detection\Scripts\guild.exe\__main__.py", line 7, in <module>
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 40, in main
    _main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 66, in _main
    guild.main.main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main.py", line 33, in main
    main_cmd.main(standalone_mode=False)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\click_util.py", line 213, in fn
    return fn0(*(args + (Args(**kw),)))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard.py", line 108, in tensorboard
    tensorboard_impl.main(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 46, in main
    _run_tensorboard(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 94, in _run_tensorboard
    monitor.run_once(exit_on_error=True)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 80, in run_once
    runs = self.list_runs_cb()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 119, in f
    _ensure_hparam_experiment(runs, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 134, in _ensure_hparam_experiment
    hparams = _experiment_hparams(runs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 146, in _experiment_hparams
    hparams.setdefault(name, set()).add(val)
TypeError: unhashable type: 'dict'

Any idea what may be causing it and what the solution is?

My system is Windows 10, Guild version is 0.7.2, and Python is 3.8.5.

My apologies for the late reply! (I had started a reply but never sent it!)

This is a bug in Guild. One of your runs has a map/dict flag value, which Guild is not correctly handling when setting up the hparam info for TensorBoard. You should be able to work around this by using the --skip-hparams option when running the tensorboard command:

guild tensorboard --skip-hparams

You can alternatively omit the applicable run using a filter.
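
To illustrate the failure outside Guild, here is a minimal sketch. The `flags` dict is a made-up example (not your actual run data), and the loop only mimics the pattern visible in the last frame of the traceback, where flag values are collected into a set:

```python
# Hypothetical flag values for a single run; `window` is a dict-valued flag.
flags = {"learning_rate": 0.001, "window": {"size": 100}}

hparams = {}
error = None
try:
    for name, val in flags.items():
        # Sets can only hold hashable values, so adding a dict raises TypeError.
        hparams.setdefault(name, set()).add(val)
except TypeError as e:
    error = str(e)

print(error)
```

Scalar values such as numbers and strings go into the set without trouble; only the dict-valued flag triggers the error.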

I opened an issue for this here. This will be fixed in the next release.

Thanks for the report and sorry again for getting back so late!

Thanks for the reply and no worries, it’s still very fast support :wink:

On Ubuntu both filtering and using --skip-hparams work.

On Windows 10, however, after using the filtering command I get the following error:

(anomaly_detection) E:\source\repos\anomaly_simulation\Zoo>guild -H E:\source\repos\anomaly_simulation\Zoo\Results tensorboard --tag binary
Preparing runs for TensorBoard
Traceback (most recent call last):
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\blepo\Anaconda3\envs\anomaly_detection\Scripts\guild.exe\__main__.py", line 7, in <module>
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 40, in main
    _main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 66, in _main
    guild.main.main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main.py", line 33, in main
    main_cmd.main(standalone_mode=False)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\click_util.py", line 213, in fn
    return fn0(*(args + (Args(**kw),)))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard.py", line 108, in tensorboard
    tensorboard_impl.main(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 46, in main
    _run_tensorboard(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 94, in _run_tensorboard
    monitor.run_once(exit_on_error=True)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 89, in run_once
    self._refresh_logdir(runs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 97, in _refresh_logdir
    self.refresh_run_cb(run, path)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 275, in f
    return _refresh_run(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 281, in _refresh_run
    _refresh_tfevent_links(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 292, in _refresh_tfevent_links
    _init_tfevent_link(tfevent_path, link, run, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 306, in _init_tfevent_link
    _init_hparam_session(run, link_dir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 314, in _init_hparam_session
    _add_hparam_experiment(state.hparam_experiment, writer)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 328, in _add_hparam_experiment
    writer.add_hparam_experiment(hparams, metrics)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 125, in add_hparam_experiment
    self._add_summary(_HParamExperiment(hparams, metrics))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 108, in _add_summary
    self._get_writer().add_event(event)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 99, in _get_writer
    self._writer = self._writer_init()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 92, in <lambda>
    self._writer_init = lambda: EventFileWriter(
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 57, in __init__
    self._writer = tensorboard.AsyncWriter(
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\plugins\tensorboard.py", line 70, in AsyncWriter
    event_file_writer.RecordWriter(open(filename, "wb")), max_queue_size, flush_secs
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\blepo\\AppData\\Local\\Temp\\guild-tensorboard-4xhgkuo_\\52d15085 ResNet_train 2021-03-15

When using --skip-hparams, the error is as follows:

(anomaly_detection) E:\source\repos\anomaly_simulation\Zoo>guild -H E:\source\repos\anomaly_simulation\Zoo\Results tensorboard --skip-hparams
Preparing runs for TensorBoard
Traceback (most recent call last):
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\util.py", line 641, in _windows_symlink
    subprocess.check_output(args, shell=True, stderr=subprocess.STDOUT)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mklink', 'C:\\Users\\blepo\\AppData\\Local\\Temp\\guild-tensorboard-0mf82txu\\52d15085 ResNet_train 2021-03-15 19_31_12 binary binarize=yes dev=no dimensionality=125 horizon=1 learning_rate=0.001 n_feature_maps=64 optimizer=adam window=100\\.guild\\events.out.tfevents.1615862999.blez-au.17693.0', 'E:\\source\\repos\\anomaly_simulation\\Zoo\\Results\\runs\\52d15085552c457e92147334e33664de\\.guild\\events.out.tfevents.1615862999.blez-au.17693.0']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\blepo\Anaconda3\envs\anomaly_detection\Scripts\guild.exe\__main__.py", line 7, in <module>
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 40, in main
    _main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 66, in _main
    guild.main.main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main.py", line 33, in main
    main_cmd.main(standalone_mode=False)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\click_util.py", line 213, in fn
    return fn0(*(args + (Args(**kw),)))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard.py", line 108, in tensorboard
    tensorboard_impl.main(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 46, in main
    _run_tensorboard(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 94, in _run_tensorboard
    monitor.run_once(exit_on_error=True)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 89, in run_once
    self._refresh_logdir(runs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 97, in _refresh_logdir
    self.refresh_run_cb(run, path)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 275, in f
    return _refresh_run(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 281, in _refresh_run
    _refresh_tfevent_links(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 292, in _refresh_tfevent_links
    _init_tfevent_link(tfevent_path, link, run, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 308, in _init_tfevent_link
    util.symlink(tfevent_src, tfevent_link)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\util.py", line 625, in symlink
    _windows_symlink(target, link)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\util.py", line 643, in _windows_symlink
    raise OSError(e.returncode, e.output.decode(errors="ignore").strip())
PermissionError: [Errno 1] The system cannot find the path specified.

I’m running commands from PyCharm both on Ubuntu and Windows 10.

I’m OK with using it on Ubuntu, and in the future I will just avoid dictionaries. But I thought it might be worth letting you know about the issues on Windows :slight_smile:

I’ll take a look at this on Windows—it looks like there might be something else going on there.

The latest release candidate went out last night, which has a fix for the dict support. You should be able to use TensorBoard without the --skip-hparams or filtering workarounds now. You’ll need to use the --pre option when upgrading Guild:

pip install --upgrade --pre guildai

You should get 0.7.3rc2, which has the fix.

Brilliant, thanks once again :slight_smile:

I think the issue here is unrelated to the flag value type support (the original issue). I confirmed that the fix in rc2 works on Windows.

I think you may be running into something related to your file system on the Windows machine. I’m not able to recreate that error because I think our setups may be different.

A couple questions:

  • Do you see the same errors when running from a standard Windows Command Prompt (rather than the PyCharm terminal)?

  • Is the current directory on a network drive or otherwise mounted on a different path from what appears as the terminal cwd (e.g. under a symlinked dir, etc.)?

I forgot to say that I’m using Anaconda both on Ubuntu and Windows.

I think you are correct, because upgrading Guild to rc2 didn’t fix the issue in the PyCharm terminal. As you said, there is no longer any need to filter the runs. However, the same ‘cannot find the path’ error as before still occurs.

When running

(anomaly_detection) >guild -H E:\source\repos\anomaly_simulation\Zoo\Results tensorboard

directly in Anaconda Prompt the error is similar:

Preparing runs for TensorBoard
Traceback (most recent call last):
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\blepo\Anaconda3\envs\anomaly_detection\Scripts\guild.exe\__main__.py", line 7, in <module>
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 40, in main
    _main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main_bootstrap.py", line 66, in _main
    guild.main.main()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\main.py", line 33, in main
    main_cmd.main(standalone_mode=False)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\click_util.py", line 213, in fn
    return fn0(*(args + (Args(**kw),)))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard.py", line 108, in tensorboard
    tensorboard_impl.main(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 44, in main
    _run_tensorboard(args)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\commands\tensorboard_impl.py", line 86, in _run_tensorboard
    monitor.run_once(exit_on_error=True)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 89, in run_once
    self._refresh_logdir(runs)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\run_util.py", line 97, in _refresh_logdir
    self.refresh_run_cb(run, path)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 282, in f
    return _refresh_run(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 288, in _refresh_run
    _refresh_tfevent_links(run, run_logdir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 299, in _refresh_tfevent_links
    _init_tfevent_link(tfevent_path, link, run, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 313, in _init_tfevent_link
    _init_hparam_session(run, link_dir, state)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 321, in _init_hparam_session
    _add_hparam_experiment(state.hparam_experiment, writer)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\tensorboard.py", line 335, in _add_hparam_experiment
    writer.add_hparam_experiment(hparams, metrics)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 125, in add_hparam_experiment
    self._add_summary(_HParamExperiment(hparams, metrics))
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 108, in _add_summary
    self._get_writer().add_event(event)
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 99, in _get_writer
    self._writer = self._writer_init()
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 92, in <lambda>
    self._writer_init = lambda: EventFileWriter(
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\summary.py", line 57, in __init__
    self._writer = tensorboard.AsyncWriter(
  File "c:\users\blepo\anaconda3\envs\anomaly_detection\lib\site-packages\guild\plugins\tensorboard.py", line 69, in AsyncWriter
    record_writer = event_file_writer.RecordWriter(open(filename, "wb"))
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\blepo\\AppData\\Local\\Temp\\guild-tensorboard-koqw_bh0\\52d15085 ResNet_train 2021-03-15 19_31_12 binary binarize=yes dev=no dimensionality=125 horizon=1 learning_rate=0.001 n_feature_maps=64 optimizer=adam window=100\\.guild\\events.out.tfevents.0000000000.hparams'

The Anaconda environment is on drive C, while the code and Guild home are on drive E. I don’t know if that’s relevant, but in the traceback above Guild/TensorBoard looks for the runs in an unexpected directory.

That’s helpful information! I’ll play around with Anaconda and see if that might be the issue. I suspect there’s a path link involved here that’s causing issues on Windows. The cross drive/volume issue is a good data point.
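
One more thing that may be worth ruling out, and this is only a guess that nothing in this thread confirms: the temp path in the traceback embeds the full run label, so it may be long enough to hit the classic Windows MAX_PATH limit of 260 characters. Here is a rough sketch for checking that. The helper name `exceeds_max_path` is made up for illustration, and the path pieces are copied from the FileNotFoundError above:

```python
import ntpath  # Windows path rules, importable on any OS

# Classic Windows MAX_PATH limit; it includes the terminating NUL character.
MAX_PATH = 260


def exceeds_max_path(path, limit=MAX_PATH):
    """Return True if `path` is too long for APIs bound by the classic limit."""
    return len(path) >= limit


# Path pieces copied from the FileNotFoundError in the traceback.
tmp_dir = r"C:\Users\blepo\AppData\Local\Temp\guild-tensorboard-koqw_bh0"
run_dir = (
    "52d15085 ResNet_train 2021-03-15 19_31_12 binary binarize=yes dev=no "
    "dimensionality=125 horizon=1 learning_rate=0.001 n_feature_maps=64 "
    "optimizer=adam window=100"
)
event_file = "events.out.tfevents.0000000000.hparams"

full_path = ntpath.join(tmp_dir, run_dir, ".guild", event_file)
print(len(full_path), exceeds_max_path(full_path))
```

If the check comes back true on your machine, a shorter run label or enabling long path support in Windows would be the things to try. If it comes back false, the cause lies elsewhere.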

I have encountered this too, but in my case the problem was that I included special characters in the hparams (e.g. labels, tags, etc.). So far I have noticed that “{}./” might cause problems.

Basically, I changed the code a little to work around this:

# line 142 of ...python/lib/site-packages/python3.7/guild/tensorboard.py
def _experiment_hparams(runs):
    hparams = {}
    for run in runs:
        for name, val in (run.get("flags") or {}).items():
            # == Add these lines ==
            if isinstance(val, dict):
                val = str(val)
            # ==end==
            hparams.setdefault(name, set()).add(val)
        hparams.setdefault(SOURCECODE_HPARAM, set()).add(_run_sourcecode(run))
    return hparams

My guess is that there’s an eval() somewhere that converts the strings to values so they can be put into hparams properly, and the labels get turned into values, lists or dicts, but I was too lazy to dig into the code.

Awesome! Yep, that’s the right approach. This fix is available in rc2, which is available as a pre-release.
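
For anyone who can’t upgrade yet, the same idea can be sketched outside Guild. This is not the code that shipped in rc2, just an illustration: the function names are made up, `runs` is a list of plain dicts standing in for Guild’s run objects, and the check is widened from dicts to any unhashable value (lists, for example):

```python
def hashable_flag_value(val):
    """Return `val` unchanged if it can go in a set, else its string form."""
    try:
        hash(val)
    except TypeError:
        return str(val)
    return val


def experiment_hparams(runs):
    """Collect the distinct values seen for each flag across runs.

    Each run is a plain dict with an optional "flags" mapping; this stands in
    for Guild's run objects and is an assumption of the sketch.
    """
    hparams = {}
    for run in runs:
        for name, val in (run.get("flags") or {}).items():
            hparams.setdefault(name, set()).add(hashable_flag_value(val))
    return hparams


# Hypothetical runs: one dict-valued flag, one list-valued flag, one run without flags.
runs = [
    {"flags": {"optimizer": "adam", "window": {"size": 100}}},
    {"flags": {"optimizer": "sgd", "layers": [64, 64]}},
    {"flags": None},
]
result = experiment_hparams(runs)
print(result)
```

Testing with hash() rather than isinstance(val, dict) means list-valued flags are covered too, at the cost of showing those values as strings in the hparams view.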