I tried to train a random forest using guildai. Here is my script:
It works as expected if I just execute the script locally. But if I run it through guild:
`guild run --yes --max-trials 2 random_forest:train num_threads=4 tb_volatility_scaler=[1,1.5,2] ts_look_forward_window=[1200,2400] sample_weights_type=[returns,time_decay,trend_scanning] max_depth=range[2:6:1] max_features=range[5:100:5] n_estimators=range[50:1000:50] min_weight_fraction_leaf=range[0.01:0.1:0.01] class_weight=[balanced_subsample,balanced]`
it returns an error:
(base) PS C:\Users\Mislav\Documents\GitHub\trademl> guild run --yes --max-trials 2 random_forest:train num_threads=4 tb_volatility_scaler=[1,1.5,2] ts_look_forward_window=[1200,2400] sample_weights_type=[returns,time_decay,trend_scanning] max_depth=range[2:6:1] max_features=range[5:100:5] n_estimators=range[50:1000:50] min_weight_fraction_leaf=range[0.01:0.1:0.01] class_weight=[balanced_subsample,balanced]
WARNING: Could not parse requirement: -umpy
WARNING: Could not parse requirement: -illow
WARNING: Could not parse requirement: -umpy
WARNING: Could not parse requirement: -illow
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\data\sampe_Data.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\X_TEST.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\nn_sample.csv because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model_25.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
WARNING: Skipping potential source code file C:\Users\Mislav\Documents\GitHub\trademl\trademl\modeling\random_forest\rf_model_25_ts.json because it's too big. To control which files are copied, define 'sourcecode' for the operation in a Guild file.
INFO: [guild] Running trial c8f45a92942845fa9ef7fc8d5d8afd73: random_forest:train (class_weight=balanced, cv_number=4, labeling_technique=triple_barrier, max_depth=2, max_features=70, min_weight_fraction_leaf=0.02, n_estimators=950, num_threads=4, sample_weights_type=trend_scanning, structural_break_regime=all, tb_triplebar_min_ret=0.004, tb_triplebar_num_days=10, tb_volatility_lookback=50, tb_volatility_scaler=2, ts_look_forward_window=2400)
INFO: [numexpr.utils] Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO: [numexpr.utils] NumExpr defaulting to 8 threads.
2020-07-21 21:58:55.448030 100.0% apply_pt_sl_on_t1 done after 0.19 minutes. Remaining 0.0 minutes.
dropped label: 0 0.00015123254524373645
'fit' took 53.25 seconds to run.
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\.guild\runs\c8f45a92942845fa9ef7fc8d5d8afd73\.guild\sourcecode\trademl\modeling\train_rf.py", line 202, in <module>
    sample_weight_train=sample_weigths,
NameError: name 'sample_weigths' is not defined
The error says `sample_weigths` is not defined, but the variable is defined in the script, and the script runs without this error when I execute it inside my IDE.
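One pattern that would produce exactly this symptom is a variable that is assigned only inside some branches of a flag check. The failing trial above ran with `sample_weights_type=trend_scanning`, which may differ from the value used when the script is run directly. The sketch below is hypothetical: the branch values, the `SCRIPT` body, and the `run_with_flag` helper are assumptions for illustration and are not taken from `train_rf.py`.

```python
# Hypothetical stand-in for a training script; none of this is taken from
# train_rf.py. The variable is bound only for some flag values.
SCRIPT = '''
if sample_weights_type == "returns":
    sample_weigths = [1.0, 2.0, 3.0]
elif sample_weights_type == "time_decay":
    sample_weigths = [0.5, 0.75, 1.0]
# No branch handles "trend_scanning", so the name is never bound on that path.
used = sample_weigths
'''


def run_with_flag(sample_weights_type):
    """Execute the script body at module level with one flag value."""
    namespace = {"sample_weights_type": sample_weights_type}
    try:
        exec(SCRIPT, namespace)
    except NameError as err:
        return f"NameError: {err}"
    return "ok"


if __name__ == "__main__":
    for value in ("returns", "time_decay", "trend_scanning"):
        print(value, "->", run_with_flag(value))
```

If the real script has a similar shape, running it locally with `sample_weights_type` set to `trend_scanning` should reproduce the error without guild.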
OS: Windows
IDE: VSCode
guild version: 0.7.0.rc11