Logging sklearn's classification report

chefhose · February 12, 2021, 8:48am

I’d like to log sklearn’s classification reports with guild ai for each step.
One way is to reformat the classification report to print a value per line as a normal scalar, but this makes it unreadable for humans.
I guess there is a more "guildai-ish" way to do this?

garrett · February 12, 2021, 4:51pm

Hello and welcome! This question is so good it deserves its own permanent example!

Guild AI Classification Report Example

Your instinct is good—it’s an anti-pattern to modify your code to suit a tool like Guild. Guild should support your project, not the other way around!

The example shows how output scalars are captured using patterns from output. The configuration is one of the more advanced forms supported by Guild, but it shows the flexibility.
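To make the idea of line-based pattern capture concrete, here is a minimal Python sketch. It is not Guild's implementation and not the configuration from the linked example; it only mimics the principle with a plain regular expression. The report text, its numbers, and the names `REPORT`, `ROW`, and `scan` are made up for illustration.

```python
import re

# Hypothetical text in the layout of sklearn's classification report.
# The numbers are invented for this sketch.
REPORT = """\
              precision    recall  f1-score   support

     class_1       0.50      1.00      0.67         1
     class_2       1.00      0.67      0.80         3

    accuracy                           0.75         4
"""

# One pattern applied to one line at a time: a single-token label followed by
# precision, recall, f1-score, and support.
ROW = re.compile(
    r"^\s*(?P<label>\S+)\s+(?P<precision>\d\.\d+)\s+(?P<recall>\d\.\d+)"
    r"\s+(?P<f1>\d\.\d+)\s+(?P<support>\d+)\s*$"
)


def scan(text):
    """Return a dict of scalar name -> value for every per-class row."""
    scalars = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank, and accuracy lines do not match
        label = match.group("label")
        for metric in ("precision", "recall", "f1"):
            scalars[f"{label}_{metric}"] = float(match.group(metric))
    return scalars


scalars = scan(REPORT)
for name, value in scalars.items():
    print(name, value)
```

Each row yields its scalars only because the class label and the values sit on the same line, which is why the report's tabular layout works with per-line patterns.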

For more examples on how to configure output scalars, see Guild File Reference.

chefhose · February 15, 2021, 11:10am

Thanks, this works! But what if I have, say, 20 classes and I want to track them all? Do I really need to add 20 lines (plus the ones for micro, macro, weighted, etc.) to the guild file?

garrett · February 15, 2021, 4:07pm

How are you testing the classes now? Do you run separate operations for each?

Scalar values are associated with runs. If you wanted to summarize multiple classes for a single run, you’d need to use multiple scalar names (e.g. class_1_recall, class_2_recall, etc.). That’s a much harder case because Guild’s output scalar scheme is line based: it does not support multi-line patterns. You’d need to modify the way you log output, which I think is what you’re trying to avoid based on your original post.
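If modifying the logged output turns out to be acceptable, one possible approach is to flatten the report into one `name: value` line per scalar. The sketch below assumes the dict layout that sklearn's `classification_report(..., output_dict=True)` returns, and assumes that `name: value` lines are picked up as output scalars; please check both against your setup. `flatten_report` and the sample numbers are hypothetical.

```python
def flatten_report(report, step=None):
    """Turn a classification-report style dict into 'name: value' lines.

    `report` is assumed to look like the dict returned by sklearn's
    classification_report(..., output_dict=True).
    """
    lines = []
    if step is not None:
        lines.append(f"step: {step}")
    for label, metrics in report.items():
        name = str(label).replace(" ", "_")
        if isinstance(metrics, dict):
            # Per-class and averaged rows: one line per metric.
            for metric, value in metrics.items():
                key = metric.replace("-", "_")
                lines.append(f"{name}_{key}: {value:.4f}")
        else:
            # 'accuracy' is stored as a bare number rather than a dict.
            lines.append(f"{name}: {metrics:.4f}")
    return lines


# Made-up sample data in the assumed layout.
report = {
    "class_1": {"precision": 0.5, "recall": 1.0, "f1-score": 0.6667, "support": 1},
    "class_2": {"precision": 1.0, "recall": 0.6667, "f1-score": 0.8, "support": 3},
    "accuracy": 0.75,
    "macro avg": {"precision": 0.75, "recall": 0.8333, "f1-score": 0.7333, "support": 4},
}

lines = flatten_report(report, step=3)
print("\n".join(lines))
```

Printing the usual human-readable report alongside these lines keeps the output legible while still giving each class its own scalar name.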

The other way is to run one operation per class—each class would be reflected in its own run. In that case the class would likely be implemented as a flag. Like this:

guild run op class=class_1

You could run all of the classes by listing them:

guild run op class=[class_1,class_2,...]

In that case you’d use the same output scalar patterns as used in the example. Each class would appear as a separate run in compare, lists, etc.

chefhose · February 15, 2021, 4:21pm

ok thanks. I’m working on a multilabel case, so separate runs are not an option. I will stick to the modified output then.

garrett · February 15, 2021, 4:40pm

How would you be inclined to report the output for each class? You want to capture metrics per class per step, yes?

Would you print a header for each class, along these lines?

print("Class '%s'" % class_name)

# _iter_data_for_class and print_sklearn_output are placeholder helpers
for step, data in _iter_data_for_class(...):
    print("Step %i" % step)
    print_sklearn_output(data)
