Hello,
The “ufacedetect.cpp” sample from the T-API folder in the samples folder runs very slowly after compilation. While the compiled program is running, it reports OpenCL as “ON”, but there is no difference in performance: OpenCL does not seem to offload any work from the CPU.
System information (version)
opencv-4.5.5_9
OS: FreeBSD 13.1-RELEASE-p1 amd64
Resolution: 3840x2160
DE: Plasma 5.24.6
WM: KWin
Theme: [Plasma], Breeze [GTK2/3]
Icons: [Plasma], breeze-dark [GTK2/3]
Terminal: konsole
CPU: AMD FX-8350 (8) @ 3.991GHz
GPU: Ellesmere [Radeon RX 580]
Memory: 11708MiB / 32684MiB
Compiler => FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
Here is the output of clinfo:
Here is the output of opencv_version --opencl:
4.5.5
OpenCL Platforms:
Clover
dGPU: AMD Radeon RX 580 Series (POLARIS10, DRM 3.35.0, 13.1-RELEASE-p1, LLVM 13.0.1) (OpenCL 1.1 Mesa 21.3.8)
Portable Computing Language
CPU: AMD FX(tm)-8350 Eight-Core Processor (OpenCL 1.2 pocl HSTR: pthread-x86_64-portbld-freebsd13.1-bdver2)
Current OpenCL device:
Type = dGPU
Name = AMD Radeon RX 580 Series (POLARIS10, DRM 3.35.0, 13.1-RELEASE-p1, LLVM 13.0.1)
Version = OpenCL 1.1 Mesa 21.3.8
Driver version = 21.3.8
Address bits = 64
Compute units = 36
Max work group size = 256
Local memory size = 32 KB
Max memory allocation size = 3 GB 204 MB 819 KB 204 B
Double support = Yes
Half support = Yes
Host unified memory = No
Device extensions:
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_fp64
cl_khr_extended_versioning
Has AMD Blas = No
Has AMD Fft = No
Preferred vector width char = 16
Preferred vector width short = 8
Preferred vector width int = 4
Preferred vector width long = 2
Preferred vector width float = 4
Preferred vector width double = 2
Preferred vector width half = 0
Thanks for any help.