Looking for CUDA functions

Hello, I have installed OpenCV 4.8.0 with CUDA support and Python bindings using yay (Arch Linux). I can upload and download arrays to my GTX 1080 using the OpenCV CUDA API (as shown in Getting Started with OpenCV CUDA Module), but I can’t find cv2.cuda.cvtColor, nor resize, warpAffine, etc. Any ideas?

Here are the contents of cv2.cuda:

BufferPool DEVICE_INFO_COMPUTE_MODE_DEFAULT DEVICE_INFO_COMPUTE_MODE_EXCLUSIVE DEVICE_INFO_COMPUTE_MODE_EXCLUSIVE_PROCESS DEVICE_INFO_COMPUTE_MODE_PROHIBITED DYNAMIC_PARALLELISM DeviceInfo DeviceInfo_ComputeModeDefault DeviceInfo_ComputeModeExclusive DeviceInfo_ComputeModeExclusiveProcess DeviceInfo_ComputeModeProhibited EVENT_BLOCKING_SYNC EVENT_DEFAULT EVENT_DISABLE_TIMING EVENT_INTERPROCESS Event Event_BLOCKING_SYNC Event_DEFAULT Event_DISABLE_TIMING Event_INTERPROCESS Event_elapsedTime FEATURE_SET_COMPUTE_10 FEATURE_SET_COMPUTE_11 FEATURE_SET_COMPUTE_12 FEATURE_SET_COMPUTE_13 FEATURE_SET_COMPUTE_20 FEATURE_SET_COMPUTE_21 FEATURE_SET_COMPUTE_30 FEATURE_SET_COMPUTE_32 FEATURE_SET_COMPUTE_35 FEATURE_SET_COMPUTE_50 GLOBAL_ATOMICS GpuData GpuMat GpuMatND GpuMat_defaultAllocator GpuMat_setDefaultAllocator HOST_MEM_PAGE_LOCKED HOST_MEM_SHARED HOST_MEM_WRITE_COMBINED HostMem HostMem_PAGE_LOCKED HostMem_SHARED HostMem_WRITE_COMBINED NATIVE_DOUBLE SHARED_ATOMICS SURF_CUDA SURF_CUDA_ANGLE_ROW SURF_CUDA_HESSIAN_ROW SURF_CUDA_LAPLACIAN_ROW SURF_CUDA_OCTAVE_ROW SURF_CUDA_ROWS_COUNT SURF_CUDA_SIZE_ROW SURF_CUDA_X_ROW SURF_CUDA_Y_ROW SURF_CUDA_create Stream Stream_Null TargetArchs TargetArchs_has TargetArchs_hasBin TargetArchs_hasEqualOrGreater TargetArchs_hasEqualOrGreaterBin TargetArchs_hasEqualOrGreaterPtx TargetArchs_hasEqualOrLessPtx TargetArchs_hasPtx WARP_SHUFFLE_FUNCTIONS __doc__ __loader__ __name__ __package__ __spec__ createContinuous createGpuMatFromCudaMemory ensureSizeIsEnough fastNlMeansDenoising fastNlMeansDenoisingColored getCudaEnabledDeviceCount getDevice nonLocalMeans printCudaDeviceInfo printShortCudaDeviceInfo registerPageLocked resetDevice setBufferPoolConfig setBufferPoolUsage setDevice unregisterPageLocked wrapStream
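Rather than scanning the listing by eye, the same check can be scripted. This is only a sketch: missing_names is a hypothetical helper (not part of OpenCV), and the expected names are just the three asked about above.

```python
from types import SimpleNamespace

# The three functions asked about in the question.
EXPECTED = ["cvtColor", "resize", "warpAffine"]

def missing_names(namespace, names=EXPECTED):
    """Return the names from `names` that `namespace` does not provide."""
    return [name for name in names if not hasattr(namespace, name)]

try:
    import cv2
    # getattr guards against builds that expose no cv2.cuda submodule at all.
    print("missing from cv2.cuda:", missing_names(getattr(cv2, "cuda", None)))
except ImportError:
    # Hypothetical stand-in so the sketch also runs without OpenCV installed.
    fake_cuda = SimpleNamespace(GpuMat=object, fastNlMeansDenoising=object)
    print("missing from stand-in:", missing_names(fake_cuda))
```

With the listing above, all three names would be reported as missing.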

Ah, it seems that warpAffine, for example, is provided by opencv_contrib (see https://github.com/opencv/opencv_contrib/tree/4.x/modules/cudawarping). Hmm… the only way to get the combination of Arch + CUDA + Python + OpenCV + contrib seems to be to build it myself, following e.g. How to enable NVIDIA CUDA with OpenCV in Arch Linux - Jeremy's Programming Blog. Will try this now!
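One way to confirm whether a given build actually includes a contrib module such as cudawarping is to look at cv2.getBuildInformation(). The sketch below assumes the summary contains a line beginning with "To be built:" followed by space-separated module names; modules_to_be_built is a hypothetical helper, and the fallback text is a made-up excerpt in that assumed layout.

```python
def modules_to_be_built(build_info):
    """Extract module names from the 'To be built:' line of the build summary."""
    marker = "To be built:"
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.startswith(marker):
            return stripped[len(marker):].split()
    return []

try:
    import cv2
    info = cv2.getBuildInformation()
except ImportError:
    # Made-up excerpt in the assumed layout, so the sketch runs without OpenCV.
    info = "  OpenCV modules:\n    To be built:    core cudaarithm cudawarping python3\n"

built = modules_to_be_built(info)
print("cudawarping built:", "cudawarping" in built)
```

If cudawarping is not listed, cv2.cuda.warpAffine will not exist no matter how CUDA itself is set up.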

I’m now getting this sort of error:

box% cmake ../opencv
CMake Error at CMakeLists.txt:11 (message):
  

  FATAL: In-source builds are not allowed.

         You should create a separate directory for build files.



-- Configuring incomplete, errors occurred!

(cwd is an empty build directory)

Oh, I had to remove CMakeCache.txt.
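The post does not say where that CMakeCache.txt was. One common cause of this error from an otherwise empty build directory is a cache left in the source tree by an earlier run, so the sketch below assumes that case. stale_cmake_artifacts is a hypothetical helper, and the temporary directory only stands in for the ../opencv checkout.

```python
import tempfile
from pathlib import Path

def stale_cmake_artifacts(source_dir):
    """List leftover CMake files at the top level of a source tree."""
    root = Path(source_dir)
    candidates = [root / "CMakeCache.txt", root / "CMakeFiles"]
    return [path for path in candidates if path.exists()]

# Demo on a throwaway directory standing in for the source checkout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "CMakeCache.txt").write_text("# stale cache\n")
    for path in stale_cmake_artifacts(tmp):
        print("leftover:", path.name)
```

The helper only reports what it finds; deleting the files is left to you.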

On the other hand, Arch’s PKGBUILD already enables CUDA and the contrib modules:

# Maintainer: Antonio Rojas <arojas@archlinux.org>
# Contributor: Ray Rashif <schiv@archlinux.org>
# Contributor: Tobias Powalowski <tpowa@archlinux.org>

pkgbase=opencv
pkgname=(opencv opencv-samples python-opencv opencv-cuda)
pkgver=4.8.0
pkgrel=3
pkgdesc='Open Source Computer Vision Library'
arch=(x86_64)
license=(BSD)
url='https://opencv.org/'
depends=(tbb openexr gst-plugins-base libdc1394 cblas lapack libgphoto2 openjpeg2 ffmpeg protobuf)
makedepends=(cmake python-numpy python-setuptools mesa eigen hdf5 lapacke qt6-5compat vtk glew ant java-environment
             pugixml openmpi cudnn fmt nlohmann-json)
optdepends=('opencv-samples: samples'
            'vtk: for the viz module'
            'glew: for the viz module'
            'qt6-base: for the HighGUI module'
            'hdf5: for the HDF5 module'
            'opencl-icd-loader: For coding with OpenCL'
            'java-runtime: Java interface')
source=(https://github.com/opencv/opencv/archive/$pkgver/$pkgname-$pkgver.tar.gz
        https://github.com/opencv/opencv_contrib/archive/$pkgver/opencv_contrib-$pkgver.tar.gz
        vtk9.patch
        cuda-12.2.patch)
sha256sums=('cbf47ecc336d2bff36b0dcd7d6c179a9bb59e805136af6b9670ca944aef889bd'
            'b4aef0f25a22edcd7305df830fa926ca304ea9db65de6ccd02f6cfa5f3357dbb'
            'f35a2d4ea0d6212c7798659e59eda2cb0b5bc858360f7ce9c696c77d3029668e'
            '2acacd8df0fab431aa2197304c4496f3e4d8a8de9305994a6474e4c66dc3a159')

prepare() {
  patch -d $pkgname-$pkgver -p1 < vtk9.patch # Don't require all vtk optdepends
  patch -d $pkgname-$pkgver -p1 < cuda-12.2.patch # Fix build with CUDA 12.2
}

build() {
  export JAVA_HOME="/usr/lib/jvm/default"
  # cmake's FindLAPACK doesn't add cblas to LAPACK_LIBRARIES, so we need to specify them manually
  _opts="-DWITH_OPENCL=ON \
         -DWITH_OPENGL=ON \
         -DOpenGL_GL_PREFERENCE=LEGACY \
         -DCMAKE_CXX_STANDARD=17 \
         -DWITH_TBB=ON \
         -DWITH_VULKAN=ON \
         -DWITH_QT=ON \
         -DBUILD_TESTS=OFF \
         -DBUILD_PERF_TESTS=OFF \
         -DBUILD_EXAMPLES=ON \
         -DBUILD_PROTOBUF=OFF \
         -DPROTOBUF_UPDATE_FILES=ON \
         -DINSTALL_C_EXAMPLES=ON \
         -DINSTALL_PYTHON_EXAMPLES=ON \
         -DCMAKE_INSTALL_PREFIX=/usr \
         -DCPU_BASELINE_DISABLE=SSE3 \
         -DCPU_BASELINE_REQUIRE=SSE2 \
         -DOPENCV_EXTRA_MODULES_PATH=$srcdir/opencv_contrib-$pkgver/modules \
         -DOPENCV_SKIP_PYTHON_LOADER=ON \
         -DLAPACK_LIBRARIES=/usr/lib/liblapack.so;/usr/lib/libblas.so;/usr/lib/libcblas.so \
         -DLAPACK_CBLAS_H=/usr/include/cblas.h \
         -DLAPACK_LAPACKE_H=/usr/include/lapacke.h \
         -DOPENCV_GENERATE_PKGCONFIG=ON \
         -DOPENCV_ENABLE_NONFREE=ON \
         -DOPENCV_JNI_INSTALL_PATH=lib \
         -DOPENCV_GENERATE_SETUPVARS=OFF \
         -DEIGEN_INCLUDE_PATH=/usr/include/eigen3 \
         -DCMAKE_FIND_PACKAGE_PREFER_CONFIG=ON \
         -Dprotobuf_MODULE_COMPATIBLE=ON"
 
  cmake -B build -S $pkgname-$pkgver $_opts \
    -DBUILD_WITH_DEBUG_INFO=ON
  cmake --build build

  CFLAGS="${CFLAGS} -fno-lto" CXXFLAGS="${CXXFLAGS} -fno-lto" LDFLAGS="${LDFLAGS} -fno-lto" \
  cmake -B build-cuda -S $pkgname-$pkgver $_opts \
    -DBUILD_WITH_DEBUG_INFO=OFF \
    -DWITH_CUDA=ON \
    -DWITH_CUDNN=ON \
    -DCMAKE_C_COMPILER=gcc-12 \
    -DCMAKE_CXX_COMPILER=g++-12 \
    -DCUDA_ARCH_BIN='52-real;53-real;60-real;61-real;62-real;70-real;72-real;75-real;80-real;86-real;87-real;89-real;90-real;90-virtual' \
    -DCUDA_ARCH_PTX='90-virtual'
  cmake --build build-cuda
}

package_opencv() {
  DESTDIR="$pkgdir" cmake --install build

  # install license file
  install -Dm644 $pkgbase-$pkgver/LICENSE -t "$pkgdir"/usr/share/licenses/$pkgname

  # separate samples package
  mv "$pkgdir"/usr/share/opencv4/samples "$srcdir"

  # Add java symlinks expected by some binary blobs
  ln -sr "$pkgdir"/usr/share/java/{opencv4/opencv-${pkgver//./},opencv}.jar
  ln -sr "$pkgdir"/usr/lib/{libopencv_java${pkgver//./},libopencv_java}.so

  # Split Python bindings
  rm -r "$pkgdir"/usr/lib/python3*
}

package_opencv-samples() {
  pkgdesc+=' (samples)'
  depends=(opencv)
  unset optdepends

  mkdir -p "$pkgdir"/usr/share/opencv4
  mv samples "$pkgdir"/usr/share/opencv4

  # install license file
  install -Dm644 $pkgbase-$pkgver/LICENSE -t "$pkgdir"/usr/share/licenses/$pkgname
}

package_python-opencv() {
  pkgdesc='Python bindings for OpenCV'
  depends=(python-numpy opencv vtk glew qt6-base hdf5 jsoncpp openmpi pugixml fmt)
  unset optdepends

  DESTDIR="$pkgdir" cmake --install build/modules/python3

  # install license file
  install -Dm644 $pkgbase-$pkgver/LICENSE -t "$pkgdir"/usr/share/licenses/$pkgname
}

package_opencv-cuda() {
  pkgdesc+=' (with CUDA support)'
  depends+=(cudnn)
  conflicts=(opencv)
  provides=(opencv=$pkgver)
  options=(!debug)

  DESTDIR="$pkgdir" cmake --install build-cuda

  # install license file
  install -Dm644 $pkgbase-$pkgver/LICENSE -t "$pkgdir"/usr/share/licenses/$pkgname

  # Split samples
  rm -r "$pkgdir"/usr/share/opencv4/samples

  # Add java symlinks expected by some binary blobs
  ln -sr "$pkgdir"/usr/share/java/{opencv4/opencv-${pkgver//./},opencv}.jar
  ln -sr "$pkgdir"/usr/lib/{libopencv_java${pkgver//./},libopencv_java}.so

  # Split Python bindings
  rm -r "$pkgdir"/usr/lib/python3*
}

I get compilation errors whatever I do. I tried gcc 10 instead of 13, but no luck…

Ok, so I had the following problems:

  • general OpenCV compile errors about Java JNI: I needed to fix my Java install (e.g. the javac command was broken)
  • CUDA-specific OpenCV compile errors like /usr/include/stdlib.h(141): error: identifier "_Float32" is undefined: I needed sudo ln -s /usr/bin/g++-12 /opt/cuda/bin/g++, and the same for gcc, since CUDA apparently hasn’t caught up with gcc-13 yet (just like back in the gcc-10 days!). I also did export CC=gcc-12 and export CXX=g++-12, just to be sure.
  • an "ambiguous overload" error: solved as described in CUDA 12.2 fp16 dnn compilation error · Issue #23893 · opencv/opencv · GitHub
  • an ovis-specific error: I needed to pass -D BUILD_opencv_ovis=OFF in my cmake call (beware, the option name is case-sensitive)
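To keep the compiler and module choices from this list in one place, the configure command can be assembled in a small script. This is a sketch, not a complete configuration: it uses only flags that appear in this thread or in the PKGBUILD above, cuda_cmake_command is a hypothetical helper, and the source and contrib paths are placeholders.

```python
import shlex
import shutil

def cuda_cmake_command(source_dir, contrib_modules_dir, build_dir="build"):
    """Assemble a cmake configure command from flags mentioned in this thread."""
    return [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-DWITH_CUDA=ON",
        "-DCMAKE_C_COMPILER=gcc-12",      # host compiler older than gcc-13
        "-DCMAKE_CXX_COMPILER=g++-12",
        f"-DOPENCV_EXTRA_MODULES_PATH={contrib_modules_dir}",
        "-DBUILD_opencv_ovis=OFF",        # option name is case-sensitive
    ]

# Hypothetical checkout locations; adjust to your own layout.
cmd = cuda_cmake_command("opencv", "opencv_contrib/modules")
print(shlex.join(cmd))

# Report whether the older compilers are installed before configuring.
for tool in ("gcc-12", "g++-12"):
    print(tool, "found" if shutil.which(tool) else "not on PATH")
```

The script only prints the command; passing cmd to subprocess.run would execute it.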

Just saying: you can’t use CUDA from the Java API.

I also have some warnings during cmake which I simply ignore:

CMake Warning at samples/samples_utils.cmake:10 (add_executable):
  Cannot generate a safe runtime search path for target
  example_opencl_opencl-opencv-interop because files in some directories may
  conflict with libraries in implicit directories:

    runtime library [libOpenCL.so.1] in /usr/lib may be hidden by files in:
      /opt/cuda/lib64

  Some of these libraries may not be found correctly.
Call Stack (most recent call first):
  samples/opencl/CMakeLists.txt:33 (ocv_define_sample)

After following the rest of the steps at How to enable NVIDIA CUDA with OpenCV in Arch Linux - Jeremy's Programming Blog, I finally have cv2.cuda.warpAffine!
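For anyone who gets this far, a minimal sketch of using it from Python: upload an image, shift it with cv2.cuda.warpAffine, and download the result. translation_matrix and warp_on_gpu are hypothetical helper names; the GPU part needs a CUDA-enabled build with the cudawarping module and a working device, so only the matrix helper runs everywhere.

```python
def translation_matrix(dx, dy):
    """2x3 affine matrix (as nested lists) that shifts an image by (dx, dy)."""
    return [[1.0, 0.0, float(dx)], [0.0, 1.0, float(dy)]]

def warp_on_gpu(image, dx, dy):
    """Shift `image` on the GPU; needs a CUDA build with the cudawarping module."""
    import cv2
    import numpy as np

    if not hasattr(cv2.cuda, "warpAffine"):
        raise RuntimeError("cv2.cuda.warpAffine is missing (no cudawarping module)")
    gpu_src = cv2.cuda.GpuMat()
    gpu_src.upload(image)                      # host -> device
    matrix = np.array(translation_matrix(dx, dy), dtype=np.float32)
    height, width = image.shape[:2]
    gpu_dst = cv2.cuda.warpAffine(gpu_src, matrix, (width, height))
    return gpu_dst.download()                  # device -> host

print(translation_matrix(10, 5))
```

Calling warp_on_gpu(img, 10, 5) on a NumPy image should return an array of the same size as the input.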