Forest optimizer example

Do you have an example of how to use the forest optimizer in Guild AI? I haven’t used the skopt functions before, and their official docs are fairly sparse. For example, I would like to try the optimizers on the random forest model I posted earlier:


I assume the dimension space (the flags I use) shouldn’t be too large (up to six), is that right?

Specify forest as the optimizer arg like this:

guild run train -o forest

You can set optimizer flags using -Fo like this:

guild run train -o forest -Fo random-starts=5 -Fo xi=0.1

See Optimizers Reference for supported flags.

So I just need to add those extra arguments, whatever my flags are? Will this work for all estimators or only tree-based ones?

Yes, you can freely use different sequential optimizers with the same flag ranges. Each optimizer has its own set of flags (see the Optimizers Reference link above), but they all work on the same flag ranges. The only difference is the model/algorithm used to perform the optimization.
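
For example, the same search definition can be handed to different optimizers; only the -o value changes (gp and random here are Guild’s other built-in optimizers):

guild run train lr=loguniform[0.0001:0.1] -o forest --max-trials 20
guild run train lr=loguniform[0.0001:0.1] -o gp --max-trials 20
guild run train lr=loguniform[0.0001:0.1] -o random --max-trials 20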

Ok thanks. Just to confirm, I can use flags (parameters) that don’t have anything to do with the model, right? For example, a threshold for removing outliers, or the way labels are constructed?

I’m not totally clear on what you mean by parameters in this case. Guild lets you set any flag value. It doesn’t know whether a flag value is associated with a model or not, so you can use flags for anything you like.
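
For example, assuming your script exposes an outlier-threshold flag (an illustrative name, not something Guild provides), you can search over it exactly like a model hyperparameter:

guild run train outlier-threshold=[2.5,3.0,3.5] lr=loguniform[0.0001:0.1] -o forest --max-trials 20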

For constructing labels, are you talking about Guild run labels? That’s separate from flags. There are various options to the run command that deal with labels.
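
You can, for instance, set a run label explicitly when starting a run:

guild run train lr=0.01 --label "baseline lr=0.01"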

Keep in mind that in the context of optimization, there are two operations in play: the trial prototype and the optimizer. That’s discussed in Hyperparameter Optimization.

You set flag values for your trial prototype the normal way (i.e. NAME=VALUE). Those values include the search ranges, choice lists, etc. The optimizer operation uses those values to make decisions about what values to use for each trial.

You set flags for the optimizer using -Fo options (i.e. -Fo NAME=VALUE).

This lets you do a whole lot with a single command like this:

guild run train \
  lr=loguniform[0.0001:0.1] \
  dropout=[0.1,0.2,0.3] \
  -o forest \
  -Fo xi=0.1 \
  -Fo random-starts=5 \
  --max-trials 100

That little -o forest invokes a specific optimizer. Because Guild uses standard operations for these optimizers, you can create your own without a ton of work. See this example, which uses Hyperopt TPE.
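
If your project defines its own optimizer operation (say tpe, a hypothetical name here), you point -o at it the same way you would at a built-in optimizer:

guild run train lr=loguniform[0.0001:0.1] -o tpe --max-trials 100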

I was thinking about flags that are not directly connected with the model, but more with the preprocessing step. Labeling is about making labels (1, 0) from X.

I understand it now. Thanks!
