NameError: Guild doesn't recognize a defined variable?

I tried to train a random forest using Guild AI. Here is my script:

It works as expected if I just execute the script locally. But if I run it through Guild:

`guild run --yes --max-trials 2 random_forest:train num_threads=4 tb_volatility_scaler=[1,1.5,2] ts_look_forward_window=[1200,2400] sample_weights_type=[returns,time_decay,trend_scanning] max_depth=range[2:6:1] max_features=range[5:100:5] n_estimators=range[50:1000:50] min_weight_fraction_leaf=range[0.01:0.1:0.01] class_weight=[balanced_subsample,balanced]`
it returns an error:
(base) PS C:\Users\Mislav\Documents\GitHub\trademl> guild run --yes --max-trials 2 random_forest:train num_threads=4 tb_volatility_scaler=[1,1.5,2] ts_look_forward_window=[1200,2400] sample_weights_type=[returns,time_decay,trend_scanning] max_depth=range[2:6:1] max_features=range[5:100:5] n_estimators=range[50:1000:50] min_weight_fraction_leaf=range[0.01:0.1:0.01] class_weight=[balanced_subsample,balanced]
WARNING: Could not parse requirement: -umpy
WARNING: Could not parse requirement: -illow
WARNING: Could not parse requirement: -umpy
WARNING: Could not parse requirement: -illow
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\data\sampe_Data.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\X_TEST.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\nn_sample.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model_25.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model_25_ts.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
INFO: [guild] Running trial c8f45a92942845fa9ef7fc8d5d8afd73: random_forest:train (class_weight=balanced, cv_number=4, labeling_technique=triple_barrier, max_depth=2, max_features=70, min_weight_fraction_leaf=0.02, n_estimators=950, num_threads=4, sample_weights_type=trend_scanning, structural_break_regime=all, tb_triplebar_min_ret=0.004, tb_triplebar_num_days=10, tb_volatility_lookback=50, tb_volatility_scaler=2, ts_look_forward_window=2400)
INFO: [numexpr.utils] Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO: [numexpr.utils] NumExpr defaulting to 8 threads.
2020-07-21 21:58:55.448030 100.0% apply_pt_sl_on_t1 done after 0.19 minutes. Remaining 0.0 minutes.
dropped label:  0 0.00015123254524373645
'fit' took 53.25 seconds to run.
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\.guild\runs\c8f45a92942845fa9ef7fc8d5d8afd73\.guild\sourcecode\trademl\modeling\train_rf.py", line 202, in <module>
    sample_weight_train=sample_weigths,
NameError: name 'sample_weigths' is not defined

It says `sample_weigths` is not defined, but it is defined in the script, and it works if I just execute it inside my IDE.

OS: Windows
IDE: VSCode
guild version: 0.7.0.rc11

There’s a case where the variable sample_weigths is not initialized:

You want an else clause in there. If there isn’t a legit value in that case, use an assertion to catch the problem before it gets out of the init block.

if a == 1:
  b = "one"
elif a == 2:
  b = "two"
else:
  assert False, a

Guild is likely setting sample_weights_type to something that you’re not checking for, which is why you’re seeing different behavior when running with Guild.
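A minimal sketch of that failure mode (the function name, branches, and values here are hypothetical, not taken from the real train_rf.py):

```python
def compute_sample_weights(sample_weights_type):
    # Hypothetical branching to illustrate the bug described above.
    if sample_weights_type == 'returns':
        sample_weights = [1.0, 1.0]
    elif sample_weights_type == 'time_decay':
        sample_weights = [0.9, 0.5]
    else:
        # Without this else clause, an unexpected flag value (such as
        # 'trend_scanning' passed in by Guild) falls through, leaves
        # sample_weights unbound, and raises NameError on first use.
        assert False, sample_weights_type
    return sample_weights
```

With the assertion in place, a bad flag value fails loudly at the branch instead of surfacing later as a confusing NameError.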

Sorry @garrett, I have found ‘the bug’. The last branch should be

elif sample_weights_type == 'trend_scanning':

not

elif labeling_technique == 'trend_scanning':
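In other words, the branch was testing the wrong variable, so it never matched. A sketch with made-up surrounding values (not the real script's assignments):

```python
labeling_technique = 'triple_barrier'
sample_weights_type = 'trend_scanning'

# Buggy version: compares the wrong variable, so neither branch fires
# and sample_weights is left unbound for 'trend_scanning' runs.
if sample_weights_type == 'returns':
    sample_weights = 'returns-based weights'
elif labeling_technique == 'trend_scanning':  # bug: wrong variable
    sample_weights = 'trend-scanning weights'

# Fixed version: test the flag that actually selects the weighting.
if sample_weights_type == 'returns':
    sample_weights = 'returns-based weights'
elif sample_weights_type == 'trend_scanning':
    sample_weights = 'trend-scanning weights'
```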

@garrett, do you know what could be the reason for Guild not saving scalars in the same script? It saved them before.

Make sure your output scalar config is picking up the script output as expected. Use --test-output-scalars on some sample run output to see how the rules are applied. Check out the Guild File Reference for pointers.

You can evaluate the output of your latest run with this command:

guild cat --output | guild run train --test-output-scalars -

The Command Cheatsheet has some other examples to help debug operations.
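For reference, my understanding is that Guild's default output scalar rule captures lines of the form `key: value` from run output (verify against your config with `--test-output-scalars`). A minimal sketch with placeholder metric values:

```python
# Print scalars in "key: value" form so Guild's default output
# scalar rule can pick them up from the run output.
accuracy = 0.87  # hypothetical metric values, for illustration only
loss = 0.42

print(f"accuracy: {accuracy}")
print(f"loss: {loss}")
```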

@garrett, I have just discovered that Guild doesn’t save scalars when I use this option:

$Env:NO_RUN_OUTPUT=1 

I use this option because you recommended it in this thread: Timeout error

That option did solve my timeout error problem, though.

Ah, yes indeed - good catch!

The two options are: a) see if that timeout problem still occurs. As I’ve mentioned, that’s a real bear of a problem if it can’t be recreated consistently. IIRC your project uses multi-threading libraries, and getting multi-threaded processes to run reliably is something humankind has been fighting for a long time.

Option b), which I think is what you should consider at this point, is to log the scalar values explicitly using one of the various logging libraries for TF summaries. See this section for help.

I don’t have experience with TensorBoard summaries, but I will check out tensorboardX since I have various models to test.

Do you have an example of how to implement it with Guild?

You bet! Here are some links:

I don’t recall if your project uses Pytorch but this lib will be available to you without having to install yet-another-dependency (tensorboardX). Otherwise tensorboardX is pretty light and works great!

I tried with tensorboardx writer and it works. Thanks.
