Inverse DFT slowdown with VS CMake projects

Hello! I created a C++ project with VS 2022 and it worked fine. But when I recreated it as a VS CMake project, I noticed a significant slowdown in one particular function call, namely:
cv::dft(my_input_mat, my_output_mat, DFT_INVERSE | DFT_SCALE | DFT_REAL_OUTPUT);
Before, it took about 3 ms to process a roughly 400x400 matrix (debug mode); with CMake it takes about 185 ms.
I am running it in a “for” loop, so this time difference matters a lot.
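
For anyone who wants to reproduce the measurement in both projects, here is a minimal sketch of a timing helper. It assumes the call under test is wrapped in a callable; the name meanMillis is hypothetical and not part of OpenCV, and only the C++ standard library is used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs `fn` `iterations` times and returns the mean wall-clock time
// per call in milliseconds. `fn` can be any callable taking no arguments.
template <typename Fn>
double meanMillis(Fn&& fn, std::size_t iterations) {
    assert(iterations > 0 && "need at least one iteration");
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

For the call in question this could be used as meanMillis([&] { cv::dft(my_input_mat, my_output_mat, cv::DFT_INVERSE | cv::DFT_SCALE | cv::DFT_REAL_OUTPUT); }, 100), run once in each project with the same input.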
Before moving to CMake, I linked OpenCV to the project with the following project settings:

  • adding \opencv\build\include to Additional Include Directories;
  • adding opencv\build\x64\vc15\lib to Additional Library Directories;
  • and adding opencv_world(version).lib to Additional Dependencies.

As for CMake, I’ve added OpenCV to the project’s CMakeLists.txt in the following manner:
find_package(OpenCV CONFIG REQUIRED)
include_directories( ${OpenCV_INCLUDE_DIRS} )

target_link_libraries(my_project PRIVATE opencv_ml opencv_dnn opencv_core opencv_flann)
target_link_libraries(my_project PRIVATE ${OpenCV_LIBS})

Maybe I should install OpenCV with some additional settings in CMakeLists or something?
Any help will be much appreciated!

In both configurations, insert this as the first line of your code:
std::cout << cv::getBuildInformation() << "\n";

and post both results.

I notice that #1 links to an opencv_world.lib (prebuilt?), while #2 links opencv_ml, opencv_dnn, opencv_core, etc.
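
Since the full report runs to a hundred lines or more, it can help to print only the entries that matter for a comparison like this one. A minimal sketch, assuming the report has already been captured as a std::string (for example from cv::getBuildInformation()); filterBuildInfo is a hypothetical helper name, not an OpenCV function:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns every line of `buildInfo` that contains at least one of `keywords`.
// Lines are returned in their original order, each at most once.
std::vector<std::string> filterBuildInfo(const std::string& buildInfo,
                                         const std::vector<std::string>& keywords) {
    std::vector<std::string> hits;
    std::istringstream stream(buildInfo);
    std::string line;
    while (std::getline(stream, line)) {
        for (const std::string& key : keywords) {
            if (line.find(key) != std::string::npos) {
                hits.push_back(line);
                break;  // do not add the same line twice
            }
        }
    }
    return hits;
}
```

Calling filterBuildInfo(cv::getBuildInformation(), {"Configuration:", "Intel IPP", "C++ flags"}) in each project would give a handful of lines that can be compared side by side.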

Thank you for answering!
Here are the configurations reported by cv::getBuildInformation().

VS simple console project:

General configuration for OpenCV 4.6.0 =====================================
  Version control:               4.6.0

  Platform:
    Timestamp:                   2022-06-05T15:53:27Z
    Host:                        Windows 10.0.16299 AMD64
    CMake:                       3.12.18081601-MSVC_2
    CMake generator:             Visual Studio 15 2017
    CMake build tool:            C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/MSBuild/15.0/Bin/MSBuild.exe
    MSVC:                        1916
    Configuration:               Debug Release

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (16 files):         + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (31 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (5 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe  (ver 19.16.27048.0)
    C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP3  /MD /O2 /Ob2 /DNDEBUG
    C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP3  /MDd /Zi /Ob0 /Od /RTC1
    C Compiler:                  C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe
    C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP3   /MD /O2 /Ob2 /DNDEBUG
    C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP3 /MDd /Zi /Ob0 /Od /RTC1
    Linker flags (Release):      /machine:x64  /INCREMENTAL:NO
    Linker flags (Debug):        /machine:x64  /debug /INCREMENTAL
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo stitching video videoio world
    Disabled:                    python2 python3
    Disabled by dependency:      -
    Unavailable:                 java ts
    Applications:                apps
    Documentation:               NO
    Non-free algorithms:         NO

  Windows RT support:            NO

  GUI:
    Win32 UI:                    YES
    VTK support:                 NO

  Media I/O:
    ZLib:                        build (ver 1.2.12)
    JPEG:                        build-libjpeg-turbo (ver 2.1.2-62)
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         build (ver 1.6.37)
    TIFF:                        build (ver 42 - 4.2.0)
    JPEG 2000:                   build (ver 2.4.0)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES (prebuilt binaries)
      avcodec:                   YES (58.134.100)
      avformat:                  YES (58.76.100)
      avutil:                    YES (56.70.100)
      swscale:                   YES (5.9.100)
      avresample:                YES (4.0.0)
    GStreamer:                   NO
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES

  Parallel framework:            Concurrency

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/iw
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.19.1)

  OpenCL:                        YES (NVD3D11)
    Include path:                C:/build/master_winpack-build-win64-vc15/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python (for build):            C:/utils/soft/python27-x64/python.exe

  Java:
    ant:                         C:/utils/soft/apache-ant-1.10.12/bin/ant.bat (ver 1.10.12)
    JNI:                         C:/utils/soft/jdk1.8.0_333/include C:/utils/soft/jdk1.8.0_333/include/win32 C:/utils/soft/jdk1.8.0_333/include
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    C:/build/master_winpack-build-win64-vc15/install

VS Cmake project:

General configuration for OpenCV 4.7.0 =====================================
  Version control:               unknown

  Platform:
    Timestamp:                   2023-05-16T10:33:52Z
    Host:                        Windows 10.0.19045 AMD64
    CMake:                       3.26.3
    CMake generator:             Ninja
    CMake build tool:            E:/VisualStudio2022CE/Common7/IDE/CommonExtensions/Microsoft/CMake/Ninja/ninja.exe
    MSVC:                        1935
    Configuration:               Debug

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (14 files):         + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (30 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (5 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                E:/VisualStudio2022CE/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe  (ver 19.35.32217.1)
    C++ flags (Release):         /nologo /DWIN32 /D_WINDOWS /W4 /utf-8 /GR /MP   /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise /FS     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819  /MD /O2 /Oi /Gy /DNDEBUG /Z7
    C++ flags (Debug):           /nologo /DWIN32 /D_WINDOWS /W4 /utf-8 /GR /MP   /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise /FS     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819  /D_DEBUG /MDd /Z7 /Ob0 /Od /RTC1
    C Compiler:                  E:/VisualStudio2022CE/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe
    C flags (Release):           /nologo /DWIN32 /D_WINDOWS /W3 /utf-8 /MP   /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise /FS       /MD /O2 /Oi /Gy /DNDEBUG /Z7
    C flags (Debug):             /nologo /DWIN32 /D_WINDOWS /W3 /utf-8 /MP   /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise /FS     /D_DEBUG /MDd /Z7 /Ob0 /Od /RTC1
    Linker flags (Release):      /machine:x64  /nologo /DEBUG /INCREMENTAL:NO /OPT:REF /OPT:ICF    /debug
    Linker flags (Debug):        /machine:x64  /nologo    /debug /INCREMENTAL
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann highgui imgcodecs imgproc ml objdetect photo stitching video videoio
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 gapi java python2 python2 python3 ts
    Applications:                -
    Documentation:               NO
    Non-free algorithms:         NO

  Windows RT support:            NO

  GUI:                           WIN32UI
    Win32 UI:                    YES

  Media I/O:
    ZLib:                        optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/lib/zlib.lib debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/lib/zlibd.lib (ver 1.2.13)
    JPEG:                        optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/lib/jpeg.lib debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/lib/jpeg.lib (ver 62)
    WEBP:                        (ver 1.3.0)
    PNG:                         optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/lib/libpng16.lib debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/lib/libpng16d.lib (ver 1.6.39)
    TIFF:                        optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/lib/tiff.lib debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/lib/tiffd.lib (ver 42 / 4.5.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES

  Parallel framework:            Concurrency

  Trace:                         YES (built-in)

  Other third-party libraries:
    Custom HAL:                  NO
    Protobuf:                    optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/bin/libprotobuf.dll debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/bin/libprotobufd.dll   version (3.21.12.0)

  OpenCL:                        YES (NVD3D11)
    Include path:                E:/VSLibs/vcpkg/buildtrees/opencv4/src/4.7.0-87379d1df6.clean/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python (for build):            NO

  Install to:                    E:/VSLibs/vcpkg/packages/opencv4_x64-windows/debug

Hope it will be helpful.

Thank you for answering!
You are correct. I linked the CMake project following the OpenCV CMake installation guide, and I also added some lines to CMakeLists.txt that vcpkg suggested while installing OpenCV (via CMake and vcpkg.json), namely:
target_link_libraries(my_project PRIVATE opencv_ml opencv_dnn opencv_core opencv_flann)

wait, how did those libs get on your box ?
(opencv does not maintain anything vcpkg related)

those were probably built without any optimization, again, please check & show getBuildInformation() results

berak, thank you for the fast response!
Regarding your question: I don’t know. It just works :)
Here is my vcpkg.json:

{
  "name": "3ximageenhancementcmakecopy",
  "version": "1.0",
  "dependencies": [
    "opencv",
    "dlib",
    "eigen3",
    "gsl",
    {
      "name": "imgui",
      "features": [ "glfw-binding", "opengl3-binding" ]
    },
    "glfw3",
    "glad"
  ]
}

And the full CMakeLists.txt:

# CMakeList.txt : CMake project for my_project, include source and define
# project specific logic here.
#
find_package(dlib CONFIG REQUIRED) 
find_package(Eigen3 CONFIG REQUIRED)
find_package(glad CONFIG REQUIRED)
find_package(glfw3 CONFIG REQUIRED)
find_package(GSL REQUIRED)
find_package(imgui CONFIG REQUIRED)
find_package(OpenCV CONFIG REQUIRED)
include_directories( ${OpenCV_INCLUDE_DIRS} )

# Add source to this project's executable.
add_executable (my_project "main.cpp" "gui.h" "gui.cpp")

if (CMAKE_VERSION VERSION_GREATER 3.12)
  set_property(TARGET my_project PROPERTY CXX_STANDARD 20)
endif()

# TODO: Add tests and install targets if needed.
target_link_libraries(my_project PRIVATE dlib::dlib)
target_link_libraries(my_project PRIVATE Eigen3::Eigen)
target_link_libraries(my_project PRIVATE glad::glad)
target_link_libraries(my_project PRIVATE glfw)
target_link_libraries(my_project PRIVATE GSL::gsl GSL::gslcblas)
target_link_libraries(my_project PRIVATE imgui::imgui)
target_link_libraries(my_project PRIVATE opencv_ml opencv_dnn opencv_core opencv_flann)
target_link_libraries(my_project PRIVATE ${OpenCV_LIBS})

The getBuildInformation() output is provided above, in my answer to laurent.berger.

Maybe CMakePresets.json will come in handy, so I will provide it below as well:

{
    "version": 3,
    "configurePresets": [
        {
            "name": "windows-base",
            "hidden": true,
            "generator": "Ninja",
            "binaryDir": "${sourceDir}/out/build/${presetName}",
            "installDir": "${sourceDir}/out/install/${presetName}",
            "cacheVariables": {
                "CMAKE_C_COMPILER": "cl.exe",
                "CMAKE_CXX_COMPILER": "cl.exe",
                "CMAKE_TOOLCHAIN_FILE": {
                      "value": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake",
                      "type": "FILEPATH"
                }
            },
            "condition": {
                "type": "equals",
                "lhs": "${hostSystemName}",
                "rhs": "Windows"
            }
        },
        {
            "name": "x64-debug",
            "displayName": "x64 Debug",
            "inherits": "windows-base",
            "architecture": {
                "value": "x64",
                "strategy": "external"
            },
            "cacheVariables": {
                "CMAKE_BUILD_TYPE": "Debug"
            },
            "environment": {
                "VCPKG_ROOT": "E:/VSLibs/vcpkg"
            }
        },
        {
            "name": "x64-release",
            "displayName": "x64 Release",
            "inherits": "x64-debug",
            "cacheVariables": {
                "CMAKE_BUILD_TYPE": "Release"
            }
        },
        {
            "name": "x86-debug",
            "displayName": "x86 Debug",
            "inherits": "windows-base",
            "architecture": {
                "value": "x86",
                "strategy": "external"
            },
            "cacheVariables": {
                "CMAKE_BUILD_TYPE": "Debug"
            }
        },
        {
            "name": "x86-release",
            "displayName": "x86 Release",
            "inherits": "x86-debug",
            "cacheVariables": {
                "CMAKE_BUILD_TYPE": "Release"
            }
        }
    ]
}

Everything works, except for the slowdown in the inverse DFT.

First configuration:

Configuration:               Debug Release

Is the second one Debug only?

First configuration:

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/iw
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.19.1)

and the second:

  Other third-party libraries:
    Custom HAL:                  NO
    Protobuf:                    optimized E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/bin/libprotobuf.dll debug E:/VSChinaProjects/3xImageEnhancementCmake/out/build/x64-debug/vcpkg_installed/x64-windows/debug/bin/libprotobufd.dll   version (3.21.12.0)

Why haven’t you got IPP?
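
The presence of IPP can also be checked programmatically from the report text. This is a sketch under the assumption that a build with IPP prints an "Intel IPP:" line, as in the first report, while a build without it omits the line, as in the second one. Treating a value of NO as "absent" is a defensive guess rather than documented behaviour, and reportsIpp is a hypothetical name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns true when `buildInfo` contains an "Intel IPP:" line whose value
// is present and does not start with "NO".
bool reportsIpp(const std::string& buildInfo) {
    const std::string key = "Intel IPP:";
    std::istringstream stream(buildInfo);
    std::string line;
    while (std::getline(stream, line)) {
        const std::size_t pos = line.find(key);
        if (pos == std::string::npos) continue;  // not the line we want
        const std::size_t valueStart =
            line.find_first_not_of(" \t\r", pos + key.size());
        if (valueStart == std::string::npos) return false;  // key without a value
        return line.compare(valueStart, 2, "NO") != 0;
    }
    return false;  // no "Intel IPP:" line at all
}
```

Passing cv::getBuildInformation() to this function in both projects would be expected to give true for the prebuilt package and false for the vcpkg build shown above.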

Thank you for the support!
Yes, at the moment one build has Debug and Release configurations and the other is Debug only. But I’ve tested both in debug mode and both in release mode, and the iDFT slowdown persisted.
As for IPP: to be honest, I’m not familiar with it. As I understand it, it can speed up the iDFT through optimized routines built specifically for Intel processors?
If I want to add IPP support, I assume I need to modify CMakePresets.json with something like a "-D WITH_IPP=ON" command line option?

When I build OpenCV with CMake, IPP is downloaded automatically.
With your JSON file, I have no idea how to do that.

Got it. I’ll try to add it somehow :)
Hopefully, this will fix my issue. I will report the results as soon as IPP is enabled.
Thank you!

Laurent Berger, thank you very much for your suggestion!
You saved my day :)
Enabling IPP really did the trick. The iDFT processing time is back to about 3 ms (versus 180+ ms)!
For those who may encounter a similar problem,
here is how I fixed it (modified vcpkg.json):

{
  "name": "tempproj",
  "version": "1.0",
  "dependencies": [
    {
      "name": "opencv",
      "platform": "(windows & x64)",
      "features": [ "ipp" ]
    },
    "dlib",
    "eigen3",
    "gsl",
    {
      "name": "imgui",
      "features": [ "opengl3-binding", "glfw-binding" ]
    },
    "glad",
    "glfw3"
  ]
}

I added a features entry containing ipp to the opencv dependency. Also keep in mind that you may need to rebuild the whole project: in my case nothing changed even after regenerating the CMake cache.
