since “potato” is still a sizable fraction of the picture, restricting the work to those regions will gain you at most that fraction, and the extra work of specifying/obeying the regions may eat up the time saved.
depends on what is doing “the search”. you can take a bounding box for each potato and work on the submatrix of that bounding box; that’s probably the least overhead. some functions also take mask arguments.
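something like this, as a rough Python/OpenCV sketch. the filename, the Otsu call standing in for whatever segmentation you actually have, and the variable names are all placeholders:

```python
import cv2

img = cv2.imread("potatoes.png", cv2.IMREAD_GRAYSCALE)  # placeholder filename
# stand-in segmentation: pretend Otsu gives you the potato-vs-background mask
_, potato_mask = cv2.threshold(img, 0, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# one bounding box per potato: work only on the submatrix of that box
contours, _ = cv2.findContours(potato_mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    roi = img[y:y + h, x:x + w]   # numpy slicing is a view, not a copy
    # ... run the dark-spot detection on `roi` instead of `img` ...

# some functions take a mask argument instead, e.g. mean over potato pixels only
mean_brightness = cv2.mean(img, mask=potato_mask)[0]
```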
if you’re only going to threshold for dark spots, you could pick that threshold relative to the average color/brightness of the visible potato(es), run it on the entire picture, and then take the intersection with the potato mask as needed (for binary masks, bitwise AND is the same as logical AND).
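sketch of that relative-threshold idea, same placeholder assumptions as above; the 0.6 factor for “dark” is just a made-up example value:

```python
import cv2

img = cv2.imread("potatoes.png", cv2.IMREAD_GRAYSCALE)  # placeholder filename
_, potato_mask = cv2.threshold(img, 0, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)

avg = cv2.mean(img, mask=potato_mask)[0]   # average brightness of potato pixels only
dark_thresh = 0.6 * avg                    # "dark" = well below the potato average

# threshold the whole picture, then keep only pixels that are also on a potato
_, dark = cv2.threshold(img, dark_thresh, 255, cv2.THRESH_BINARY_INV)
dark_spots = cv2.bitwise_and(dark, potato_mask)  # bitwise AND == logical AND for 0/255 masks
```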