Custom visualizations between runs

One of Guild's best features is how easily you can compare different runs using its visualization tools. This works well for comparing run metrics.

I work in robotics and computer vision, and I usually need more advanced custom visualizations to describe the performance of a run.

For example, in robotics pose estimation you want a 3D line plot of the estimated pose trajectory against the ground-truth trajectory. Currently in Guild you can easily compare the RMSE between runs, but not a plot like this.
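To make the example concrete, here is a minimal sketch of the kind of comparison I mean. It assumes each run saves its estimated trajectory as a list of (x, y, z) positions aligned sample by sample with the ground truth. The function names are hypothetical, and the plotting part assumes matplotlib is installed.

```python
import math


def position_rmse(estimated, ground_truth):
    """RMSE of the Euclidean position error between two aligned trajectories."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and the same length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def plot_trajectories(trajectories, ground_truth, path="trajectories.png"):
    """Draw each run's trajectory against the ground truth in one 3D plot.

    `trajectories` maps a label (for example a run ID) to a list of (x, y, z).
    """
    # Imported lazily so the RMSE helper works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot(*zip(*ground_truth), color="black", label="ground truth")
    for label, points in trajectories.items():
        rmse = position_rmse(points, ground_truth)
        ax.plot(*zip(*points), label=f"{label} (RMSE {rmse:.3f})")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_zlabel("z")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path
```

The RMSE collapses each run to one number, while the plot shows where along the trajectory the estimate drifts, which is the part I cannot compare between runs today.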

I have a couple of ideas for implementing this myself, but I wanted to put it out there to see if others have some good ideas.

My first idea is to use the guild.ipy Python interface in a stand-alone Jupyter notebook (i.e. independent of Guild's notebook feature). The notebook would be parameterized to take any number of Guild run IDs as input and then create the comparison plots.

Another idea is to use guild.ipy together with Streamlit to create a more interactive experience.

These ideas are unfortunately quite separated from the rest of the Guild environment. I was wondering if there is a way to add a plugin to the guild view command. That way users could write their own comparison tools and integrate them into guild view, just as viewing runs in TensorBoard currently is. For example, maybe Guild could support a notebook plugin: the notebook would be executed per run and the resulting HTML notebook displayed in guild view. You would then have one HTML notebook per run and could compare them in guild view.

Open for any input and welcoming discussion!

A few thoughts…

Longer term (later this year) we are looking to add this level of customization directly to Guild View. The intended function would support custom comparison widgets that could be applied across runs. This would be similar to TensorBoard’s plugin architecture, but quite a bit simpler to implement.

Shorter term, there are two approaches that I can imagine:

  • External diff programs (limited to comparing two runs side by side)
  • Generate images and use TensorBoard’s images plugin (can compare any number of runs side by side)

As an example of diff with Notebooks, see:

The “Diff Notebooks” section shows how the external program nbdiff is used by default to compare notebooks for a run side-by-side. But any program can be used here. I think this might be the best tool to visually compare two runs, provided you generate ipynb files.

It’s not clear to me how guild.ipy fits into this scheme though. Are you using the run function in that module to create a copy of the current notebook, or a different notebook? How is Guild’s notebook support from the CLI not working in this case?

guild.ipy could potentially include a new interface for starting and stopping a run, but this is dangerous territory, as it opens up a set of practices that are uncontrolled and subject to all the problems of tracking ad hoc work in a notebook.

Thank you for this @garrett - exciting news about the custom widgets in Guild View!

I had completely forgotten about the guild diff command, and I think it might serve my purpose for now. Logging to TensorBoard could also make sense as a temporary workaround.

I made a little example to show what I meant by using the guild.ipy module.

import argparse

import guild.ipy as guild

GUILD_HOME = "..."


def plot(runs):
    # Placeholder: create your custom visualizations here.
    raise NotImplementedError


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--guild_runs", type=str, nargs="+", required=True)
    args = parser.parse_args()

    # Point guild.ipy at the Guild home that holds the runs.
    guild.set_guild_home(GUILD_HOME)

    # Keep only the runs whose IDs were passed on the command line.
    guild_runs = guild.runs()
    filtered_runs = guild_runs[guild_runs.run.isin(args.guild_runs)]

    plot(filtered_runs)

I would just run this script independently of Guild and have access to all the plotting functionality I want in Python. I hope this clarifies things.

Ah, I see. So this is even something you can run in a notebook, for the sake of your own custom visualization and comparison.

This pattern fits into what I call a “summary operation” — an operation that is applied to existing runs for some analysis after the fact. This lets you perform whatever processing you want, generate whatever plots you want, and then have a record of that analysis in the form of a Guild run.
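As a rough sketch of what a summary operation could look like, here is a script that takes the metrics of the selected runs and writes its analysis to a file. When a script runs as a Guild operation, the current directory is the run directory, so files written there are kept with the run. The `summarize` function, the `rmse` key, and the shape of the input are hypothetical; in practice the metrics would come from something like guild.ipy.

```python
import json
from pathlib import Path


def summarize(run_metrics, metric="rmse", out_dir="."):
    """Rank runs by a metric (lower is better) and write summary.json.

    `run_metrics` maps run ID -> dict of metric values (hypothetical shape).
    """
    if not run_metrics:
        raise ValueError("no runs to summarize")
    # Sort run IDs by the chosen metric, best (lowest) first.
    ranked = sorted(run_metrics.items(), key=lambda item: item[1][metric])
    summary = {
        "metric": metric,
        "best_run": ranked[0][0],
        "ranking": [
            {"run": run_id, metric: values[metric]} for run_id, values in ranked
        ],
    }
    # Written to the current directory by default, so the file becomes
    # part of the summary run's own record.
    out_path = Path(out_dir) / "summary.json"
    out_path.write_text(json.dumps(summary, indent=2))
    return summary
```

Plots or reports could be written the same way, alongside or instead of the JSON file.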

Now that you spell it out this way, I can see that this is similar to a customized UI in Guild View. I think the difference is that Guild View (the future version) lets you build a comparison UI once and then apply it to various selected runs in a more exploratory/ad hoc fashion.

Yes, a “summary operation” is a good way to put it.

And my other idea was to essentially do the same using Streamlit, where you would have a way of selecting the runs you want to compare more interactively. It seems like that is going to resemble what eventually will be possible in Guild View.

Thank you for your input! I will probably proceed with the “summary operation” approach and eagerly await the Guild View features.
