DNN GPU Broken CUDA Issues - Pls help

Hello, I was getting this error after running a Python script that tries to add GPU computing functionality to some OpenCV DNN code. I built OpenCV from source for GPU support, and it does seem to be recognizing my GPU, but inference is running on the CPU and is giving me this error:

checkVersions CUDART version 11020 reported by cuDNN 8100 does not match with the version reported by CUDART 11000

I have done a lot of googling but I don't see any references to this error and am unable to parse what it means. I would be very grateful if someone could walk me through how to fix it. I am using OpenCV 4.5 with CUDA 11.1 and a laptop RTX 2060 with a compute capability of 7.5. My GPU does work with TensorFlow and Darknet.


the error comes from

where did your OpenCV come from: an official release, a package from some package manager, or your own build?

in any case, those versions apparently need to match, but they don't. Do you have multiple versions of the CUDA libraries installed?

Yes, I did build it from source so that I could get OpenCV to use my GPU in Python. I do not know how to make those versions match. I do not recall installing multiple versions of CUDA; however, it is possible. Where would you recommend I start to fix this issue?
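One quick way to check for multiple CUDA installs is to look for `cuda-*` toolkit directories and duplicate copies of the runtime library on disk. A minimal sketch, assuming typical Ubuntu install locations (`/usr/local` and the paths below are defaults, not guaranteed on every system):

```python
import glob
import os

def find_cuda_installs(root="/usr/local"):
    """List CUDA toolkit directories (e.g. /usr/local/cuda-11.1) under root."""
    return sorted(d for d in glob.glob(os.path.join(root, "cuda-*"))
                  if os.path.isdir(d))

def find_cudart_libs(lib_dirs=("/usr/local/cuda/lib64",
                               "/usr/lib/x86_64-linux-gnu")):
    """List every copy of the CUDA runtime library found in lib_dirs."""
    hits = []
    for d in lib_dirs:
        hits.extend(sorted(glob.glob(os.path.join(d, "libcudart.so*"))))
    return hits

if __name__ == "__main__":
    print("CUDA toolkits found:", find_cuda_installs())
    print("libcudart copies found:", find_cudart_libs())
```

If more than one `cuda-*` directory or `libcudart` major version shows up, the build and the runtime may be picking different ones.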

-- NVIDIA GPU arch: 75
-- NVIDIA PTX archs:

-- cuDNN: YES (ver 8.1.0)

-- OpenCL: YES (no extra features)
-- Include path: /home/asd/opencv/3rdparty/include/opencl/1.2
-- Link libraries: Dynamic load

-- Python 3:
-- Interpreter: /usr/bin/python3 (ver 3.8.6)
-- Libraries: /usr/lib/x86_64-linux-gnu/libpython3.8.so (ver 3.8.6)
-- numpy: /home/asd/.local/lib/python3.8/site-packages/numpy/core/include (ver 1.19.4)
-- install path: lib/python3.8/dist-packages/cv2/python-3.8

-- Python (for build): /usr/bin/python3
-- Pylint: /home/asd/.local/bin/pylint (ver: 3.8.6, checks: 181)

-- Java:
-- ant: NO
-- JNI: /usr/lib/jvm/default-java/include /usr/lib/jvm/default-java/include/linux /usr/lib/jvm/default-java/include
-- Java wrappers: NO
-- Java tests: NO

-- Install to: /usr/local

It implies that cuDNN 8.1 requires CUDA 11.2, which according to the release notes is not true (10.2 and above should be supported according to the support matrix).

My guess would be that this check is too strict. cudnnGetCudartVersion() is defined as:

The same version of a given cuDNN library can be compiled against different NVIDIA® CUDA® Toolkit™ versions. This routine returns the CUDA Toolkit version that the currently used cuDNN library has been compiled against.

so the check looks to be rejecting CUDA 11.0 because cuDNN 8.1 was built against CUDA 11.2. That said, I think this would have been reported somewhere already if it's true.

Anyway, an easy test would be to install cuDNN 8.0.0-8.0.3, as these should have been built against CUDA 11.0. You shouldn't need to rebuild, as cuDNN is dynamically linked.
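You can also query `cudnnGetCudartVersion()` yourself to see which CUDA Toolkit your installed cuDNN binary was built against. A sketch via ctypes, assuming `libcudnn` is discoverable on the loader path (returns None if it isn't, so it degrades gracefully):

```python
import ctypes
import ctypes.util

def get_cudnn_cudart_version():
    """Return the CUDA Toolkit version cuDNN was compiled against
    (e.g. 11020 for CUDA 11.2), or None if libcudnn cannot be loaded."""
    name = ctypes.util.find_library("cudnn")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    # size_t cudnnGetCudartVersion(void);
    lib.cudnnGetCudartVersion.restype = ctypes.c_size_t
    lib.cudnnGetCudartVersion.argtypes = []
    return int(lib.cudnnGetCudartVersion())

if __name__ == "__main__":
    print("cuDNN was built against CUDART:", get_cudnn_cudart_version())
```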

If I do that, I believe it will make my TensorFlow not work? I'm kind of paranoid because the TensorFlow installation was not enjoyable, and I want that to continue to work. If I install 8.0.3, does it only work with CUDA 10.2? I have installed CUDA 11. Can I install both simultaneously?


8.0.3 should work with CUDA 11.0

You have several options, but the easiest thing to do is check that this is the problem first. I can't remember exactly how to install cuDNN on Linux, but from memory you just need to set two symlinks. If that's it, you can test to see whether this is the problem.

If it is, you should have 3 options:

  1. use the new version of cuDNN if it works with tensorflow,
  2. use cuDNN 8.1.0 and install CUDA 11.2 and re-compile OpenCV (or do the same with CUDA 11.1 if a previous version of cuDNN is suitable for tensorflow), or
  3. comment out the check crackwitz pointed you to and re-compile.

[WARN:0] global /home/aoberai/opencv/modules/dnn/src/cuda4dnn/init.hpp (34) checkVersions cuDNN reports version 8003 which does not match with the version 8100 with which OpenCV was built

Different error now with cuDNN 8.0.3. I'm assuming I need to rebuild OpenCV?

That's unfortunate; it looks like you will have to build OpenCV again. Fingers crossed it doesn't need to rebuild too much.

Looks like there's already an issue for this (I might be wrong). It is fairly new.

I got a similar message:
[ WARN:0] global g:\lib\opencv\modules\dnn\src\cuda4dnn/init.hpp (42) cv::dnn::cuda4dnn::checkVersions CUDART version 11010 reported by cuDNN 8005 does not match with the version reported by CUDART 11020

but it's only a warning. My GPU is still being used:

  NVIDIA CUDA:                   YES (ver 11.2, CUFFT CUBLAS)
    NVIDIA GPU arch:             86
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.0.5)


The same version of a given cuDNN library can be compiled against different NVIDIA® CUDA® Toolkit™ versions. This routine returns the CUDA Toolkit version that the currently used cuDNN library has been compiled against.

I think this warning is useless, isn't it?

I agree, it’s only a warning, and it might be overly cautious to warn about these situations.

the original issue was that the OP's code doesn't use the GPU… so I wonder: does OpenCV even see the GPU at runtime, let alone pick it or be forced to use it? I am unfamiliar with the CUDA modules, cuda4dnn in particular. How can that be determined?
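At runtime you can ask the `cv2.cuda` module directly whether OpenCV sees a CUDA device. A minimal sketch (it returns None when `cv2` is missing or was built without CUDA support, rather than raising):

```python
def cuda_device_count():
    """Number of CUDA devices OpenCV can see, or None if the cv2 CUDA
    module is unavailable (no cv2 installed, or built without CUDA)."""
    try:
        import cv2
        return cv2.cuda.getCudaEnabledDeviceCount()
    except (ImportError, AttributeError):
        return None

if __name__ == "__main__":
    n = cuda_device_count()
    if n:
        print(f"OpenCV sees {n} CUDA device(s)")
    else:
        print("OpenCV cannot see a CUDA device")
```

When a device is found, `cv2.cuda.printCudaDeviceInfo(0)` prints details about it.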


Rebuilding after changing the cuDNN version worked. It does appear to be running on the GPU; however, it is rather slow. I should be able to run YOLO on OpenCV DNN at around 20 FPS, but I am getting 2. I have no idea why.

There are three versions of CUDART in play here:

  1. CUDART version that is used to build cuDNN binaries
  2. CUDART version that is used to build the OpenCV
  3. CUDART version that is used at runtime

The three need not always agree. OpenCV DNN gives a few warnings when they don't match.

This warning is informing us that the cuDNN binaries were built with CUDA 11.2 but the CUDART version that is installed on your system is CUDA 11.0.
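For reference, the integers in these warnings encode versions: CUDART packs major*1000 + minor*10 (so 11020 is 11.2), while cuDNN 8.x packs major*1000 + minor*100 + patch (so 8100 is 8.1.0). A small decoding helper, my own convenience functions rather than anything from OpenCV:

```python
def decode_cudart(v):
    """Decode a CUDART version integer, e.g. 11020 -> '11.2'."""
    return f"{v // 1000}.{(v % 1000) // 10}"

def decode_cudnn(v):
    """Decode a cuDNN 8.x version integer, e.g. 8100 -> '8.1.0'."""
    return f"{v // 1000}.{(v % 1000) // 100}.{v % 100}"

# The numbers from the warnings in this thread:
print(decode_cudart(11020), decode_cudart(11000))  # 11.2 11.0
print(decode_cudnn(8100), decode_cudnn(8003))      # 8.1.0 8.0.3
```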

The check exists because cuDNN downloads page used to have separate binaries for different minor versions of CUDART.

The warning just informs the end-user about the version mismatch but takes no special action that could prevent the GPU from being used.

The first forward pass is generally very slow since a lot of initialization takes place. Ignore the first forward pass while benchmarking.
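A simple benchmarking pattern for this is to run one untimed warm-up pass and then average over the remaining calls. A sketch with a generic callable standing in for `net.forward()`:

```python
import time

def benchmark(fn, warmup=1, runs=10):
    """Time fn(), discarding `warmup` initial calls, and return the
    mean seconds per call over the remaining `runs` calls."""
    for _ in range(warmup):
        fn()  # first pass pays one-time initialization cost; ignore it
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# With a dnn net this would be something like:
#   secs = benchmark(lambda: net.forward())
#   print(f"{1.0 / secs:.1f} FPS")
```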

There is also a known performance regression. You need cuDNN 7 for best performance.

Please check that you have set the backend and target to DNN_BACKEND_CUDA and DNN_TARGET_CUDA (or DNN_TARGET_CUDA_FP16), as shown in this example script
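The two relevant calls, sketched as a small helper. The hard-coded fallback enum values match the OpenCV 4.x headers but are only used when `cv2` itself is not importable, so treat them as an assumption for dry-run purposes:

```python
def enable_cuda(net, fp16=False):
    """Point a cv2.dnn network at the CUDA backend and target."""
    try:
        import cv2
        backend = cv2.dnn.DNN_BACKEND_CUDA
        target = cv2.dnn.DNN_TARGET_CUDA_FP16 if fp16 else cv2.dnn.DNN_TARGET_CUDA
    except ImportError:
        backend = 5                  # DNN_BACKEND_CUDA in OpenCV 4.x
        target = 7 if fp16 else 6    # DNN_TARGET_CUDA_FP16 / DNN_TARGET_CUDA
    net.setPreferableBackend(backend)
    net.setPreferableTarget(target)
    return net

# Typical use (the model file names are placeholders):
#   net = cv2.dnn.readNetFromDarknet("yolo.cfg", "yolo.weights")
#   enable_cuda(net)
```

Note that FP16 only pays off on GPUs with fast half-precision support; on many cards DNN_TARGET_CUDA is the safer default.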

I think the quoted paragraph is saying that the same version of the cuDNN "source" can be compiled with different versions of the CUDA Toolkit. cuDNN used to have different releases for different minor versions of CUDA, so I'm not sure the check is too strict. But there is something new since CUDA 11.1:

First introduced in CUDA 11.1, CUDA Enhanced Compatibility provides two benefits:

  • By leveraging semantic versioning across components in the CUDA Toolkit, an application can be built for one CUDA minor release (such as 11.1) and work across all future minor releases within the major family (such as 11.x).

But this still doesn't explain what's happening: cuDNN built for CUDA 11.2 need not be compatible with the older CUDA 11.0, does it?

But this is incorrect since CUDA 11.1. The check needs to be updated.

If the user requested CUDA backend for inference, warnings will be issued when no GPU device was detected or the selected device is incompatible.
