Logging sklearn's classification report

I’d like to log sklearn’s classification reports with Guild AI for each step.
One way is to reformat the classification report to print one value per line as a normal scalar, but that makes it unreadable for humans.
I guess there is a more "guildai-ish" way to do this?
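As a hedged sketch of that reformatting approach (the label/metric naming scheme here is my own assumption, not anything Guild requires): `classification_report` with `output_dict=True` returns nested dicts, which can be flattened into one "name: value" line per metric.

```python
# Sketch: flatten sklearn's classification report into one
# "name: value" line per metric so each can be captured as a scalar.
from sklearn.metrics import classification_report

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# output_dict=True returns nested dicts instead of the formatted table
report = classification_report(y_true, y_pred, output_dict=True)

for label, metrics in report.items():
    key = label.replace(" ", "_")  # e.g. "macro avg" -> "macro_avg"
    if isinstance(metrics, dict):
        for name, value in metrics.items():
            print("%s_%s: %f" % (key, name.replace("-", "_"), value))
    else:
        # "accuracy" is reported as a bare float, not a dict
        print("%s: %f" % (key, metrics))
```

This prints lines like `0_recall: 1.000000` and `macro_avg_f1_score: ...`, which are machine-friendly but, as noted, hard on human readers.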

Hello and welcome! This question is so good it deserves its own permanent example!

Guild AI Classification Report Example

Your instinct is good—it’s an anti-pattern to modify your code to suit a tool like Guild. Guild should support your project, not the other way around!

The example shows how output scalars are captured from script output using patterns. The configuration is one of the more advanced forms Guild supports, but it shows the flexibility.
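As an illustrative sketch only (the operation and script names are assumptions, not the example's actual configuration), an output scalar pattern in guild.yml might look like this, using Guild's `(\step)`, `(\key)`, and `(\value)` capture templates:

```yaml
train:
  main: train  # assumed script name
  output-scalars:
    - step: 'Step (\step)'      # lines like "Step 3" set the current step
    - '(\key): (\value)'        # lines like "precision: 0.91" become scalars
```

Each matching line of script output is logged as a scalar at the current step, so the script itself stays free of Guild-specific code.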

For more examples on how to configure output scalars, see Guild File Reference.


Thanks, this works! But what if I have, say, 20 classes and I want to track them all—do I really need to add 20 lines (plus the ones for micro, macro, weighted, etc.) to the Guild file?

How are you testing the classes now? Do you run separate operations for each?

Scalar values are associated with runs. If you wanted to summarize multiple classes for a single run, you’d need to use multiple scalar names (e.g. class_1_recall, class_2_recall, etc.). That’s a much harder case, as Guild’s output scalar scheme is line based—it does not support multi-line patterns. You’d need to modify the way you log output, which, I think, is what you’re trying to avoid based on your original post.
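To sketch what that per-class naming could look like (the class names and metric values below are placeholders, not real results):

```python
# Sketch: one scalar name per class (class_1_recall, ...) so a single
# run can log every class at every step, one value per line.
metrics_by_class = {  # placeholder data standing in for real results
    "class_1": {"precision": 0.91, "recall": 0.88},
    "class_2": {"precision": 0.85, "recall": 0.90},
}

lines = []
for step in range(2):
    lines.append("Step %i" % step)
    for cls, metrics in metrics_by_class.items():
        for name, value in metrics.items():
            lines.append("%s_%s: %f" % (cls, name, value))

print("\n".join(lines))
```

Because every metric lands on its own line with a unique name, a line-based output scalar pattern can capture each one—at the cost of modifying how the script logs output.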

The other way is to run one operation per class—each class would be reflected in its own run. In that case the class would likely be implemented as a flag. Like this:

guild run op class=class_1

You could run all of the classes by listing them:

guild run op class=[class_1,class_2,...]

In that case you’d use the same output scalar patterns as used in the example. Each class would appear as a separate run in compare, lists, etc.

OK, thanks. I’m working on a multilabel case, so separate runs are not an option. I will stick to the modified output then.

How would you be inclined to report the output for each class? You want to capture metrics per class per step, yes?

Would you print a header for each class, along these lines?

print("Class '%s'" % class_name)

for step, data in _iter_data_for_class(...):
    print("Step %i" % step)