4
Critique from a data scientist made me rethink my model evaluation entirely
I was at a meetup in Portland last Tuesday and this older guy looked at my confusion matrix and said "you're just chasing accuracy, aren't you?" Totally called me out. I had been tweaking hyperparameters for weeks trying to get that number above 95% on my image classifier. He pointed out my recall on minority classes was terrible, like under 30%. Spent the weekend rebalancing the dataset and switching to F1 as my main metric. My accuracy dropped 3 points but the model actually works now on real data. Has anyone else gotten that kind of wake-up call from a stranger?
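For anyone who hasn't been bitten by this yet, here's a minimal sketch (with made-up numbers, not the poster's actual data) of how a 96% accuracy can coexist with 20% recall on a minority class:

```python
import numpy as np

# Hypothetical illustration: 95 majority samples, 5 minority samples.
y_true = np.array([0] * 95 + [1] * 5)

# A classifier that almost always predicts the majority class:
# it catches only 1 of the 5 minority samples.
y_pred = np.array([0] * 95 + [1] + [0] * 4)

# Overall accuracy: fraction of all predictions that are correct.
accuracy = (y_true == y_pred).mean()

# Recall on the minority class: caught positives / actual positives.
minority_mask = y_true == 1
minority_recall = (y_pred[minority_mask] == 1).mean()

print(f"accuracy:        {accuracy:.2f}")   # 0.96, looks great
print(f"minority recall: {minority_recall:.2f}")  # 0.20, terrible
```

Same model, two very different stories depending on which number you stare at.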
2 comments
sam530 · 2d ago
Whoa that's a solid point about class balance actually! But I think the real hidden gem here is that even F1 can trick you if your minority classes are tiny - like a 1% class getting 50% recall can still post a decent-looking F1 if precision is high, even though your model is missing half of them. Precision-recall curves showed me that stuff way better than any single number ever could.
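To put numbers on that (these are invented for illustration, not from any real model): a 1% class where you catch half the positives with perfect precision still scores an F1 around 0.67, which reads as "fine" on a dashboard.

```python
import numpy as np

# Hypothetical setup: 1000 samples, a 1% minority class (10 positives).
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

# The model catches 5 of the 10 positives and raises no false alarms.
y_pred = np.zeros(1000, dtype=int)
y_pred[:5] = 1

# Confusion-matrix counts for the minority class.
tp = ((y_true == 1) & (y_pred == 1)).sum()
fp = ((y_true == 0) & (y_pred == 1)).sum()
fn = ((y_true == 1) & (y_pred == 0)).sum()

precision = tp / (tp + fp)   # 1.0 -- no false positives
recall = tp / (tp + fn)      # 0.5 -- half the minority is missed
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.2f}")      # ~0.67, looks acceptable as a single number
```

The single F1 number hides the fact that half the minority class never gets flagged, which is exactly why the curve view is more honest.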
4
coleman.jade · 3d ago
Ngl I kinda disagree with that take. Accuracy isn't always the enemy, especially if your classes are balanced and you're just trying to get a solid baseline. Swapping to F1 doesn't automatically fix things if your data's still messy.
3