Hyperopt TPE vs Skopt Gaussian Processes

In developing the Hyperopt example, I wanted to compare its performance to Scikit-Optimize, specifically the gp optimizer.

I ran 50 trials for each optimizer, minimizing loss for the Get Started mock training script.
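For reference, the objective here is the mock training script from the Get Started guide, which takes a single flag x and prints a noisy loss. A rough sketch of that kind of script, written from memory, so the exact function body and flag names may differ:

```python
import numpy as np

# Flag values - Guild overrides these per trial
x = 0.1
noise = 0.1

# Noisy 1-D objective (assumed form of the mock loss)
loss = np.sin(5 * x) * (1 - np.tanh(x ** 2)) + np.random.randn() * noise

print("loss: %f" % loss)
```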

Scatterplot results for tpe:

[screenshot: tpe trials, x vs loss scatterplot]

Scatterplot results for gp:

[screenshot: gp trials, x vs loss scatterplot]

Notice how gp concentrates its trials around the minimum. I would have expected the same from tpe, which makes me wonder whether the Hyperopt example is implemented incorrectly.
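To help rule out a problem in the example itself, the two optimizers can also be compared directly against the libraries, outside Guild. Below is a minimal sketch that mirrors the runs here (50 trials, x in [-2.0, 2.0]); the mock_loss function is my assumed stand-in for the Get Started objective, not the example's actual code:

```python
import numpy as np
from hyperopt import fmin, tpe, hp
from skopt import gp_minimize

def mock_loss(x, noise=0.1):
    # Assumed stand-in for the Get Started mock training objective
    return np.sin(5 * x) * (1 - np.tanh(x ** 2)) + np.random.randn() * noise

# Hyperopt TPE: 50 evaluations of x in [-2.0, 2.0]
tpe_best = fmin(
    fn=mock_loss,
    space=hp.uniform("x", -2.0, 2.0),
    algo=tpe.suggest,
    max_evals=50,
)

# Scikit-Optimize GP: same budget and search space
gp_res = gp_minimize(
    lambda params: mock_loss(params[0]),
    dimensions=[(-2.0, 2.0)],
    n_calls=50,
)

print("tpe best x:", tpe_best["x"])
print("gp best x:", gp_res.x[0], "best loss:", gp_res.fun)
```

If TPE still spreads its trials broadly on this standalone version, the behavior is inherent to the optimizer settings rather than the Guild example.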

Steps to reproduce

From the example directory, generate 50 trials using tpe:

guild run train -o tpe -m50 x=[-2.0:2.0] -t tpe

Next, generate 50 trials using gp:

guild run train -o gp -m50 x=[-2.0:2.0] -t gp

View the tpe trials in TensorBoard:

guild tensorboard -l tpe

Click HPARAMS and then the scatterplot tab. Deselect all flags and metrics except x and loss.

Do the same for the gp trials:

guild tensorboard -l gp