Open source and experiment tracking

Guild goes out of its way to let you track experiments without changing your code. There are a couple of reasons for all the fuss:

  • It keeps your code independent of the experiment tracking scheme

Okay, there’s just one reason.
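
To make that concrete, here is a minimal sketch of a script that imports nothing from any tracking tool. The file name `train.py`, the variable names, and the toy objective are all made up for illustration. They are assumptions, not anything Guild requires. Hyperparameters are ordinary module-level variables and the result is an ordinary line printed to standard output.

```python
# train.py: a hypothetical script with no experiment tracking imports.
# Hyperparameters are plain module-level variables (names are illustrative).
learning_rate = 0.1
epochs = 5


def train(learning_rate, epochs):
    """Minimize the toy objective (w - 3)^2 by gradient descent.

    Returns the loss after the given number of update steps.
    """
    w = 0.0
    for _ in range(epochs):
        grad = 2 * (w - 3)          # derivative of (w - 3)^2 with respect to w
        w -= learning_rate * grad   # plain gradient descent update
    return (w - 3) ** 2


if __name__ == "__main__":
    loss = train(learning_rate, epochs)
    # Report the result as an ordinary "key: value" line on stdout.
    print(f"loss: {loss:.4f}")
```

Anyone can run this with `python train.py` and nothing else installed. A tool that works from the outside can still set the variables and read the printed result. With Guild, that would be something along the lines of `guild run train.py learning_rate=0.01`, though check the Guild documentation for exactly how flags and output values are detected.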

Why is this important?

When you embed a dependency in your code, your code carries that dependency wherever it goes. Everyone must satisfy it. If the requirements are minimal, no big deal. But what if they’re not?

Consider what many experiment tracking tools require:

  • Databases
  • Distributed file systems
  • Network connectivity to back-end systems
  • Authorization credentials

You want to “just run” your code? Not so fast.

This is not only questionable design, since it violates the principle of separation of concerns. It also undermines the principles of open source software.

Richard Stallman:

“Creativity can be a social contribution, but only in so far as society is free to use the results.”

This is never more true than in machine learning. If you can’t run a piece of software because it’s tied to systems you can’t access, you can’t use it.

While your code may be open source, it’s stymied. People are free to study your code, but not to run it.

Where’s the experimentation in that?
