28

Caught my AI training script using training data for validation for 3 months

I was running my object detector's training loop again last week and noticed the accuracy numbers looked too good. It turns out I had a file path error in my Python script, so it was pulling validation images from the training folder instead of the validation folder. I felt like a total rookie when I spotted it in the terminal. Has anyone else had a sneaky bug like that waste their time?
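For anyone who wants a guard against this kind of path mixup, here is a rough sketch of a check that could run before training starts. The function name `shared_files` and the folder layout are made up for illustration, and it compares files by name only, so it assumes the two splits do not reuse filenames for different images.

```python
from pathlib import Path


def shared_files(train_dir, val_dir):
    """Return filenames that appear in both split folders.

    Hypothetical helper: raises ValueError if both arguments resolve
    to the same folder, which is the path bug described above.
    """
    train = Path(train_dir).resolve()
    val = Path(val_dir).resolve()
    if train == val:
        raise ValueError(f"train and validation both point at {train}")

    # Compare by filename only (assumption: names are unique per image).
    train_names = {p.name for p in train.rglob("*") if p.is_file()}
    val_names = {p.name for p in val.rglob("*") if p.is_file()}
    return sorted(train_names & val_names)
```

An empty list means no filename shows up in both folders. Anything else is worth a look before trusting the validation numbers.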
2 comments

iris_green84
Yeah, that "accuracy numbers looked too good" part hit hard. I had the same kind of bug where my training script was accidentally reading from the test set for validation. It took me like a week to figure out because the loss curves looked perfect but then the model flopped in production. The fix I use now is to print out the first few file paths from each dataloader at the start of every run. Just a simple check that takes five seconds but saves you from that embarrassing realization later. Also, hardcoding a small validation set that never changes is a solid backup plan for catching these mixups early.
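Roughly what that check could look like, as a sketch rather than my exact code. It assumes you can get a list of file paths for each split. `preview_paths` and `split_fingerprint` are names invented for this example, not part of any library.

```python
import hashlib
from itertools import islice


def preview_paths(name, paths, n=3):
    """Print and return the first n file paths of a split."""
    head = [str(p) for p in islice(paths, n)]
    print(f"[{name}] first {len(head)} paths:")
    for p in head:
        print("   ", p)
    return head


def split_fingerprint(paths):
    """Hash the sorted path list so a fixed validation set can be
    compared against a value recorded on an earlier run."""
    joined = "\n".join(sorted(str(p) for p in paths))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Call `preview_paths("train", train_paths)` and `preview_paths("val", val_paths)` at the top of the run and eyeball the folders. For the fixed validation set, save the fingerprint once and compare it at startup. If it changes, the set is no longer the one you hardcoded.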
9
ryan_carr59
Printing a batch of sample paths before each training run caught this exact mistake for me.
7