Debug opencv_imgcodecs490d.dll is very slow

I updated my project from OpenCV 3.4.3 to 4.9.0 and noticed that some functions (e.g. cv::Sobel()) became much slower in debug mode. In release mode everything works well. See the details below:

opencv 4.9.0


CMake configuration for 4.9.0:

General configuration for OpenCV 4.9.0 =====================================
Version control: unknown

Extra modules:
Location (extra): C:/Users/kvy/Downloads/opencv_contrib-4.x/modules
Version control (extra): unknown

Platform:
Timestamp: 2024-03-15T07:47:27Z
Host: Windows 10.0.19045 AMD64
CMake: 3.29.0-rc3
CMake generator: Visual Studio 17 2022
CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe
MSVC: 1939
Configuration: Debug Release

CPU/HW features:
Baseline: SSE SSE2
requested: SSE2
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX
requested: SSE4_1 SSE4_2 AVX FP16
SSE4_1 (19 files): + SSE3 SSSE3 SSE4_1
SSE4_2 (2 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (1 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (9 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 AVX

C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.39.33519/bin/Hostx64/x86/cl.exe (ver 19.39.33522.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.39.33519/bin/Hostx64/x86/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /arch:SSE /arch:SSE2 /MP /O2 /Ob2 /DNDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /arch:SSE /arch:SSE2 /MP /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:X86 /INCREMENTAL:NO
Linker flags (Debug): /machine:X86 /debug /INCREMENTAL
ccache: NO
Precompiled headers: YES
Extra dependencies:
3rdparty dependencies:

OpenCV modules:
To be built: aruco bgsegm bioinspired calib3d ccalib core datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
Disabled: js_bindings_generator python_bindings_generator python_tests world
Disabled by dependency: -
Unavailable: alphamat cannops cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev cvv freetype hdf java julia matlab ovis python2 python3 sfm viz
Applications: tests perf_tests apps
Documentation: NO
Non-free algorithms: NO

Windows RT support: NO

GUI: WIN32UI
Win32 UI: YES
VTK support: NO

Media I/O:
ZLib: build (ver 1.3)
JPEG: build-libjpeg-turbo (ver 2.1.3-62)
SIMD Support Request: YES
SIMD Support: YES
WEBP: build (ver encoder: 0x020f)
PNG: build (ver 1.6.37)
TIFF: build (ver 42 - 4.2.0)
JPEG 2000: build (ver 2.5.0)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES

Video I/O:
DC1394: NO
FFMPEG: YES (prebuilt binaries)
avcodec: YES (58.134.100)
avformat: YES (58.76.100)
avutil: YES (56.70.100)
swscale: YES (5.9.100)
avresample: YES (4.0.0)
GStreamer: NO
DirectShow: YES
Media Foundation: YES
DXVA: YES

Parallel framework: Concurrency

Trace: YES (with Intel ITT)

Other third-party libraries:
Intel IPP: 2021.11.0 [2021.11.0]
at: C:/Users/kvy/Downloads/opencv-4.9.0/build/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2021.11.0)
at: C:/Users/kvy/Downloads/opencv-4.9.0/build/3rdparty/ippicv/ippicv_win/iw
Lapack: YES (C:/Users/kvy/Downloads/OpenBLAS-0.3.26-x86/lib/libopenblas.lib)
Eigen: NO
Custom HAL: NO
Protobuf: build (3.19.1)
Flatbuffers: builtin/3rdparty (23.5.9)

OpenCL: YES (NVD3D11)
Include path: C:/Users/kvy/Downloads/opencv-4.9.0/3rdparty/include/opencl/1.2
Link libraries: Dynamic load

Python (for build): NO

Install to: C:/Users/kvy/Downloads/opencv-4.9.0/build/install

Any ideas?

How much slower? Can you quantify that with measurements and a comparison?

See the second screenshot for 3.4.3.


CPU usage for cv::Sobel: 32.2% (4.9.0) and 2.24% (3.4.3)
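
Profiler percentages like the ones above are relative to the rest of the program, so a wall-clock measurement of the call itself can make the comparison between the two debug builds more direct. Below is a minimal sketch of such a measurement using only the C++ standard library. The helper name meanMilliseconds is hypothetical and not part of OpenCV; it is one possible way to take the measurement, not the method used in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): runs `work` `iterations` times
// and returns the mean wall-clock time per call in milliseconds.
template <typename F>
double meanMilliseconds(F&& work, std::size_t iterations)
{
    assert(iterations > 0);  // a mean over zero runs is undefined

    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        work();
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;

    return elapsed.count() / static_cast<double>(iterations);
}
```

In the application the callable would wrap the call under test, for example meanMilliseconds([&] { cv::Sobel(src, dst, CV_16S, 1, 0); }, 100), where src and dst are placeholder cv::Mat objects. Running the same binary against both debug builds of the library then gives two numbers that can be compared directly.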

What do you understand about the practical differences between debug and release builds?

I have no problems with the release builds. Here we are talking about two debug builds. The old 3.4.3 debug build is about 15 times faster than the new 4.9.0 one. This is not normal, and I would like to know why. For my work I need the new version to perform like the old one in debug. Is that possible?

Assuming you just downloaded the binary packages, you might have to build OpenCV yourself to fit your requirements.

You can compare the return value of cv::getBuildInformation() for each binary package (3.4 and 4.x) and each build flavor (debug/release) that you have.

To be sure: I’m talking about the settings used to build the library, not the settings used to build your application and link against OpenCV.
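
Once the output of cv::getBuildInformation() has been saved as text for each build, the compiler flag lines can be compared mechanically instead of by eye. The sketch below assumes the dump contains lines such as "C++ flags (Debug): ..." as in the configuration listing earlier in the thread. The function names flagsAfterLabel and onlyIn are hypothetical helpers, not OpenCV functions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the whitespace-separated tokens that follow
// `label` on the first line containing it, e.g. label = "C++ flags (Debug):".
// Options written with a space, such as "/D NAME", end up as two tokens.
std::set<std::string> flagsAfterLabel(const std::string& buildInfo,
                                      const std::string& label)
{
    assert(!label.empty());

    std::set<std::string> flags;
    std::istringstream lines(buildInfo);
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t pos = line.find(label);
        if (pos == std::string::npos)
            continue;
        std::istringstream tokens(line.substr(pos + label.size()));
        std::string token;
        while (tokens >> token)
            flags.insert(token);
        break;  // only the first matching line is used
    }
    return flags;
}

// Hypothetical helper: flags present in `a` but absent from `b`, sorted.
std::vector<std::string> onlyIn(const std::set<std::string>& a,
                                const std::set<std::string>& b)
{
    std::vector<std::string> result;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(result));
    return result;
}
```

Calling onlyIn(oldFlags, newFlags) and then onlyIn(newFlags, oldFlags) lists what each build has that the other lacks. The same approach works for the "C flags (Debug):" and linker flag lines.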

I built these two packages myself. You can find the 4.9.0 build configuration in my first post.
Thank you for the function, I did not know about it. Now I can get the configuration for 3.4.3 too and compare them.

I’m working on it now. I compared the two build configurations (3.4.3 and 4.9.0) and found some differences in CMAKE_CXX_FLAGS and CMAKE_C_FLAGS:

  • /MDd (present in 3.4.3, absent in 4.9.0)
  • /fp:fast (in 3.4.3) versus /fp:precise (in 4.9.0).

I tried applying them to the 4.9.0 build, but this did not help. Perhaps the problem is related to C++11: OpenCV 4.x fully adopts this standard, whereas 3.4.3 does so only partially.
I have a temporary workaround: change /Ob0 to /Ob1 (https://learn.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-170) for the debug build:
CMAKE_CXX_FLAGS_DEBUG ...../Ob1
CMAKE_C_FLAGS_DEBUG ...../Ob1
This more than doubles the performance, but removes some debugging information. For example, my program (in debug) runs at 30 FPS with 3.4.3. After upgrading to 4.9.0 it dropped to 6 FPS. After replacing /Ob0 with /Ob1 it reaches 20 FPS.
I have no other ideas yet.

My final solution: build OpenCV with the following parameters in CMAKE_C_FLAGS_DEBUG and CMAKE_CXX_FLAGS_DEBUG:

/Zi /Ob2 /O2 /MD