Functions best suited for OpenCL UMat Transparent API use

I am in the process of porting from C-API OCV 2.4 to OCV 3.4 C++ API and want to take advantage of UMat and the OpenCL Transparent API.

Has anyone ever compiled a list of the functions that would be best suited to the T-API treatment, i.e. the most CPU/GPU-intensive functions? With tens of thousands of lines of code in our existing app, I thought it would be good to have a list that we could search our code against, so we can study each hit and decide whether T-API would be useful there. Thanks for any suggestions.


use GitHub search

actually, you want to keep your data on the GPU as long as possible, otherwise uploading and downloading your data will eat up any speedup.

so, use UMat throughout. it also works nicely on the CPU, but it comes with some restrictions:

  • can’t use operators, like a * b + c (have to use functions like gemm(), add(), etc.)
  • no “random access” (per pixel) ops possible

Yes, I’ve timed a few lines with high_resolution_clock, namely cv::mean() and SimpleBlobDetector::detect(), and it seems that when it is just a single function call, the UMat version is slower. I assume that is because of the time spent copying to GPU memory. However, when a run of functions like GaussianBlur, cvtColor, minMaxLoc, threshold, erode, dilate, findContours, etc. all operate on the same UMat, then I notice the speedup.
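
For reference, a rough sketch of the kind of timing harness described above. The helper name bestTimeMs is hypothetical, and it uses steady_clock rather than high_resolution_clock because it is guaranteed to be monotonic. The untimed warm-up call is there because the first OpenCL call of a function typically includes kernel compilation, which would otherwise distort a single measurement:

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `work` once untimed (warm-up), then `runs` timed
// repetitions, and returns the fastest repetition in milliseconds.
template <typename F>
double bestTimeMs(F&& work, int runs = 10) {
    using clock = std::chrono::steady_clock;
    work();  // warm-up, not measured
    double best = std::numeric_limits<double>::max();
    for (int i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(t1 - t0).count();
        best = std::min(best, ms);
    }
    return best;
}
```

The same lambda body can be timed once with cv::Mat and once with cv::UMat. Since OpenCL work may be queued asynchronously, the UMat lambda should probably end with something that forces completion, such as cv::ocl::finish() or copying the result back to a Mat, so that the measurement covers the actual computation.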

However, I have a lot of those one-line function calls. I would think that using UMat for them would waste a lot of time copying memory back and forth between CPU and GPU, cancelling out whatever the GPU gains.

For now I’m looking for clusters of OCV functions where OCL might come in handy. Going line by line through the code is proving to be a real pain. That mean() function, for instance, surprised me. I had forgotten about it. Ran the speed test and decided to leave it as it is.

Rather than going through the whole list of OCV functions and trying to jog my memory of possible things to look for, I was hoping for a shorter list of the most CPU-intensive functions.


profile your code. that’ll tell you what functions take how long.
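
A real profiler is the better tool here, but if one is not at hand, a hand-rolled scoped timer can give a first rough picture. This is only a sketch: ScopedTimer and profileTotals are hypothetical names, the labels are whatever you choose, and the shared map is not thread-safe:

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: accumulated wall-clock milliseconds per label.
inline std::map<std::string, double>& profileTotals() {
    static std::map<std::string, double> totals;
    return totals;
}

// Adds the lifetime of this object to the total for its label.
// Not thread-safe: intended for quick single-threaded measurements.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        profileTotals()[label_] +=
            std::chrono::duration<double, std::milli>(end - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Wrapping a suspect block in braces with a ScopedTimer t("blob detection"); at the top, then dumping profileTotals() at exit, shows which labelled regions dominate and are worth a Mat versus UMat timing test.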

You’re right. That is a good idea. At least it will point to possible locations, and then, assuming OCV code is involved, I can run some timing tests to see if UMat would help. Thanks again.