Hi,
I am trying to use Guild for hyperparameter optimization. I am running with max-trials set to 50 and want to store the temporary models on the /scratch drive of our Linux cluster. I checked that the drive is mounted correctly and that I can read from and write to it. However, when I submit my guild run, I get the following error:
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_gp_main.py", line 77, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_gp_main.py", line 30, in main
    skopt_util.handle_seq_trials(batch_run, _suggest_x)
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_util.py", line 210, in handle_seq_trials
    _run_seq_trials(batch_run, suggest_x_cb)
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_util.py", line 234, in _run_seq_trials
    batch_flag_vals,
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_util.py", line 266, in _iter_seq_trials
    prev_trials = prev_trials_cb()
  File "/usr/local/lib/python3.6/dist-packages/guild/plugins/skopt_util.py", line 224, in <lambda>
    prev_trials_cb = lambda: batch_util.trial_results(batch_run, [objective_scalar])
  File "/usr/local/lib/python3.6/dist-packages/guild/batch_util.py", line 404, in trial_results
    return trial_results_for_runs(trial_runs(batch_run), scalars)
  File "/usr/local/lib/python3.6/dist-packages/guild/batch_util.py", line 408, in trial_results_for_runs
    index = _run_index_for_scalars(runs)
  File "/usr/local/lib/python3.6/dist-packages/guild/batch_util.py", line 423, in _run_index_for_scalars
    index = indexlib.RunIndex()
  File "/usr/local/lib/python3.6/dist-packages/guild/index.py", line 314, in __init__
    self._db = self._init_db()
  File "/usr/local/lib/python3.6/dist-packages/guild/index.py", line 323, in _init_db
    self._init_tables(db)
  File "/usr/local/lib/python3.6/dist-packages/guild/index.py", line 349, in _init_tables
    """
sqlite3.OperationalError: disk I/O error
I checked my /scratch drive and found that the runs and cache folders had been created. One folder had also been created inside runs, but it was empty.
The same command works perfectly when GUILD_HOME points to a different drive. I am not sure if I am missing anything here.
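Since the error is raised by sqlite3 rather than by Guild itself, a check that does not involve Guild might help narrow this down. The following is only a minimal sketch: the function name sqlite_write_check and the file name sqlite_check.db are made up for illustration, and the directory to test is passed on the command line. It tries to create a small SQLite database in the given directory and reports the error message if that fails. It does not reproduce what Guild's index does internally, so a passing check would not rule out the file system as the cause.

```python
import os
import sqlite3
import sys
import tempfile


def sqlite_write_check(directory):
    """Try to create and write a tiny SQLite database in `directory`.

    Returns None on success, or the sqlite3 error message on failure.
    The name of this helper and of the test file are hypothetical.
    """
    path = os.path.join(directory, "sqlite_check.db")
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()
        finally:
            conn.close()
        return None
    except sqlite3.OperationalError as e:
        return str(e)
    finally:
        # Clean up the test file if it was created.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    # Directory to test; falls back to the system temp dir if none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    error = sqlite_write_check(target)
    print("OK" if error is None else "SQLite failed: " + error)
```

Running this once against a directory on /scratch and once against the drive where the Guild command works would show whether plain SQLite behaves differently on the two file systems.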
I would appreciate any help from your side.
Thanks,
Vishal