Net.forward repeating results for different blobs

Hi, I’m facing an odd situation: after processing a few frames, net.forward() starts returning exactly the same results even though I’m feeding it a different blob every time. I haven’t found the root cause yet, but I have at least narrowed the problem down to net.forward(). Some side notes:

  1. It only happens with the CUDA-enabled build of OpenCV, compiled from source with CUDA and cuDNN enabled. The CUDA version is 11.2 and the cuDNN version is 8.1.0. I picked these versions because I have the very same versions running on an identical machine somewhere else, and I’m not seeing the problem there.

  2. This odd behavior does NOT happen when using opencv-python installed via pip … it always works as expected.

  3. The net is loaded with cv2.dnn.readNetFromDarknet and works fine for a second or two; then net.forward() starts returning the same values every time.

It would seem as if net.setInput(blob) failed … that would explain why I keep getting the same outputs from net.forward(). Now, is this even possible, for the net.setInput() call to fail? Out of memory or something? Just thinking out loud, I’m not even sure that’s what’s really happening.

Do you have any other possible explanation as to why net.forward would start behaving like this? (to always give the same result?)

I will keep digging nonetheless, but any help will be very much appreciated.

Thank you.
Best regards.

sounds like a bug in OpenCV’s dnn+cuda module.

best to open an issue on the github. prepare a minimal reproducible example (code) for the bug report. if this can be reproduced with opencv’s own examples, even better.
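
a minimal reproduction for such a report could look roughly like the sketch below. it is only an illustration, not code from this thread: the function names, the 416x416 input size, the 1/255 scaling and the random test frames are all assumptions, and the cfg/weights paths have to be supplied by whoever runs it. it feeds a new random frame on every iteration and reports the first iteration at which the output hash repeats.

```python
import hashlib


def output_digest(outputs) -> str:
    """Hash a sequence of contiguous buffers (e.g. the arrays from net.forward)."""
    h = hashlib.sha256()
    for out in outputs:
        h.update(out)
    return h.hexdigest()


def run_repro(cfg_path, weights_path, n_iters=500, size=416):
    """Return the first iteration whose output repeats the previous one, else None.

    Hypothetical helper: input size, scaling and iteration count are assumptions.
    """
    import cv2  # imported here so output_digest works without OpenCV installed
    import numpy as np

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    out_names = net.getUnconnectedOutLayersNames()

    rng = np.random.default_rng(0)
    prev = None
    for i in range(n_iters):
        # A fresh random frame each time, so identical outputs are suspicious.
        frame = rng.integers(0, 256, size=(size, size, 3), dtype=np.uint8)
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (size, size),
                                     swapRB=True, crop=False)
        net.setInput(blob)
        outs = [np.ascontiguousarray(o) for o in net.forward(out_names)]
        cur = output_digest(outs)
        if cur == prev:
            return i
        prev = cur
    return None
```

calling run_repro("model.cfg", "model.weights") (placeholder file names) on both machines and comparing the return values would show whether the failure is independent of the rest of the pipeline.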

have you been able to make this work correctly on any other installation where cuda is enabled?

Hi, yes, the very same project runs without any problem on an identical machine: same motherboard, same CPU, same amount of memory, both with Ubuntu 20.04, CUDA 11.2, and cuDNN 8.1.0, with OpenCV compiled to use CUDA + cuDNN on both of them (same versions) and the same compute capabilities compiled in (6.1 and 8.6). The only difference that I can think of is that the machine where it works has two GPUs (a 1080 Ti and a 3090) versus a single 1080 Ti on the one that’s presenting the issue.

I spent days debugging the code assuming that I had messed something up with my pipeline, up to the point where I hash and save to disk every frame that I pass to the net.setInput() and I can confirm that I’m passing new fresh frames to it, only to get always the same exact result from net.forward(). Yes, I agree that it would seem like a bug.
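
For anyone wanting to run the same kind of check, here is a rough sketch of the idea (not my actual code; the class and function names are made up for illustration). It hashes the bytes that go into net.setInput() and the bytes that come out of net.forward(), and flags the case where the input changed but the output did not.

```python
import hashlib


def digest(data) -> str:
    """Return a SHA-256 hex digest of any bytes-like object."""
    return hashlib.sha256(data).hexdigest()


class StuckOutputDetector:
    """Flags forward passes whose output repeats although the input changed."""

    def __init__(self):
        self.prev_in = None
        self.prev_out = None
        self.stuck_count = 0  # consecutive suspicious passes

    def update(self, input_bytes, output_bytes) -> bool:
        d_in, d_out = digest(input_bytes), digest(output_bytes)
        stuck = (
            self.prev_in is not None
            and d_in != self.prev_in    # a genuinely new input...
            and d_out == self.prev_out  # ...produced the very same output
        )
        self.stuck_count = self.stuck_count + 1 if stuck else 0
        self.prev_in, self.prev_out = d_in, d_out
        return stuck


# Intended use inside the frame loop (assumes a single output array):
#   net.setInput(blob)
#   out = net.forward()
#   if detector.update(blob.tobytes(), out.tobytes()):
#       print("same output for a different blob")
```

Hashing the blob itself rather than the source frame also rules out a bug in the preprocessing step between the frame and net.setInput().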

I’ll report the issue on github, hopefully someone will pick it up.
Thank you.

I will attach the cv2.getBuildInformation() output from both machines to the issue.
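
In case it helps whoever picks this up, a small sketch for comparing the two build reports line by line. The helper name and the "working"/"failing" labels are just illustrative assumptions; the two input strings are whatever cv2.getBuildInformation() printed on each machine.

```python
import difflib


def diff_build_info(info_a: str, info_b: str) -> list:
    """Return only the removed/added lines between two build reports."""
    diff = list(difflib.unified_diff(
        info_a.splitlines(), info_b.splitlines(),
        fromfile="working", tofile="failing", lineterm="", n=0,
    ))
    # Skip the two file-header lines, then keep only '-' and '+' entries
    # (this drops the '@@' hunk markers).
    return [ln for ln in diff[2:] if ln.startswith(("+", "-"))]


# Each report can be saved on its machine with, for example:
#   python -c "import cv2; print(cv2.getBuildInformation())" > build_info.txt
```

An empty list would mean the two builds report identical configurations, which would point away from the build and toward the driver or the hardware.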
