OpenCV-cuda: run the same function in parallel on different data using streams?


I’m working on a stereo-camera-based application, where I run many processing steps once on the left image and once on the right image.
To increase the maximum frame rate, I’m starting to use the OpenCV-cuda functions wherever I can. For now I use the synchronous versions (all functions blocking). This works well but is far from optimal: I run the code first on the left image and then on the right image, whereas both could run in parallel.

I found an old presentation, where page 25 states:

Current limitation: Unsafe to enqueue the same GPU operation multiple times


The limitation will be removed in OpenCV 3.x

Has this limitation now been removed (for OpenCV 4.2+; if it was removed in 4.5 or 4.6 that’s also fine, I will just need to update the minimum required version)?

In particular, am I allowed:

  1. to put an upload, one or more different processing steps and a download on a single stream if they all work on the same data?
  2. to put 2 uploads (or 2 downloads) of different data in the same stream?
  3. to put 2 uploads (or 2 downloads) of different data in different streams?
  4. to execute the same function twice on different data in the same stream?
  5. to execute the same function twice on different data in different streams?
  6. to perform a computation in one stream on one data, and an upload/download on different data in another stream?

The current (4.6) documentation for the Stream class (OpenCV: cv::cuda::Stream Class Reference) hints that the problem is not really solved yet:

Currently, you may face problems if an operation is enqueued twice with different data. Some functions use the constant GPU memory, and next call may update the memory before the previous one has been finished. But calling different operations asynchronously is safe because each operation has its own constant buffer. Memory copy/upload/download/set operations to the buffers you hold are also safe.

So what exactly is allowed/not-allowed?
Is there any documentation on which functions are safe to be enqueued twice on same and/or different streams?

Thanks a lot in advance

Nice presentation, I hadn’t seen that one before.

I would say they never got around to removing that limitation. I can’t remember exactly what happened, but it definitely seems like the CUDA backend was deprioritized; maybe it had something to do with Intel taking the reins, maybe not.

Essentially some routines use constant and texture memory and some do not, and unfortunately the best way to check 100% is to look at the source code.

Simple routines such as resize, colour conversion, thresholding etc. are unlikely to have a problem, but the routines you are interested in (rectification, block matching etc.) are more likely to.

Thanks a lot.

So the problem is still there (and unlikely to be solved soon).
From what I understand from the presentation, running the same function (if it uses constant or texture memory) twice on the same stream will be a problem.

But it isn’t clear to me if it is OK to run it twice on different streams?
And on a single stream, can I combine a function (using consts or textures) with others that don’t use them (I would guess I can)?
Can I run 2 different functions both using consts and textures at the same time (on same or different stream)?

Thanks a lot in advance

That is my interpretation too. I am not 100% sure why it would be an issue within the same stream, but it would definitely be an issue across multiple streams.

It might help to think of constant and texture memory as global variables, and of a stream as a queue of work that you can submit jobs to, where the order of execution within a stream (intra-stream) is guaranteed. Additionally, each queue can run at the same time as any other queue, with no guarantee on the order of jobs between streams (inter-stream). That is, you could enqueue function a() in stream 1 followed by b() in stream 2, but there is no guarantee which function will run first.
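To make the ordering rule concrete, here is a minimal plain C++ sketch (no CUDA or OpenCV involved) that lists every global order in which the jobs of two queues could run while each queue keeps its own order. The names `legalOrders` and `interleave` are made up for this illustration, and treating execution as a strict interleaving is a simplification, since real streams can also overlap in time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Jobs = std::vector<std::string>;

// Recursively extends `current` with the next job of either queue.
// Picking only the *next* job of a queue is what preserves intra-stream order.
void interleave(const Jobs& s1, std::size_t i, const Jobs& s2, std::size_t j,
                Jobs& current, std::vector<Jobs>& out) {
    if (i == s1.size() && j == s2.size()) {
        out.push_back(current);  // every job has been placed: one legal order
        return;
    }
    if (i < s1.size()) {
        current.push_back(s1[i]);
        interleave(s1, i + 1, s2, j, current, out);
        current.pop_back();
    }
    if (j < s2.size()) {
        current.push_back(s2[j]);
        interleave(s1, i, s2, j + 1, current, out);
        current.pop_back();
    }
}

// Returns all orders the two queues may execute in (hypothetical helper).
std::vector<Jobs> legalOrders(const Jobs& stream1, const Jobs& stream2) {
    std::vector<Jobs> out;
    Jobs current;
    interleave(stream1, 0, stream2, 0, current, out);
    return out;
}
```

For example, `legalOrders({"a1", "a2"}, {"b1", "b2"})` yields six orders. `a1` always precedes `a2` and `b1` always precedes `b2`, but nothing fixes where the `a` jobs fall relative to the `b` jobs.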

If you run the same function with different globals in two different queues of work, there is no guarantee that the global variable for call 1 in queue 1 won’t be overwritten by call 2 in queue 2 while call 1 is still executing.
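The overwrite described above can be imitated in plain C++ without a GPU. This is only a sketch under assumptions: it pretends that each call first writes its parameter into one shared slot (standing in for constant memory) and later runs a kernel that reads it, and it replays two queues under an explicit schedule. `SharedConstant`, `Op`, `replay` and `runTwice` are hypothetical names, not OpenCV or CUDA APIs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for constant memory: one slot shared by every call of the same function.
struct SharedConstant { int value = 0; };

// One enqueued step: either the parameter write or the kernel that reads it.
struct Op {
    enum Kind { WriteConstant, RunKernel } kind;
    int param;    // value stored by WriteConstant
    int* result;  // where RunKernel records the constant it observed
};

using StreamQueue = std::vector<Op>;

// Replays two queues. schedule[k] == 0 takes the next op of s0, 1 the next op of s1.
// Order inside each queue is preserved; order between queues follows the schedule.
void replay(const StreamQueue& s0, const StreamQueue& s1,
            const std::vector<int>& schedule, SharedConstant& c) {
    const StreamQueue* queues[2] = {&s0, &s1};
    std::size_t next[2] = {0, 0};
    for (int which : schedule) {
        const StreamQueue& q = *queues[which];
        if (next[which] >= q.size()) continue;  // that queue is already drained
        const Op& op = q[next[which]++];
        if (op.kind == Op::WriteConstant) c.value = op.param;
        else *op.result = c.value;
    }
}

// Same function enqueued once per stream with different parameters.
// Returns the constant each kernel actually saw: {stream 0, stream 1}.
std::pair<int, int> runTwice(int leftParam, int rightParam,
                             const std::vector<int>& schedule) {
    SharedConstant c;
    int seenLeft = -1, seenRight = -1;
    StreamQueue left  = {{Op::WriteConstant, leftParam, nullptr},
                         {Op::RunKernel, 0, &seenLeft}};
    StreamQueue right = {{Op::WriteConstant, rightParam, nullptr},
                         {Op::RunKernel, 0, &seenRight}};
    replay(left, right, schedule, c);
    return {seenLeft, seenRight};
}
```

With the schedule `{0, 0, 1, 1}` the two calls do not overlap and `runTwice(10, 20, ...)` returns `{10, 20}`. With `{0, 1, 0, 1}` the second write lands before the first kernel runs, so the result is `{20, 20}`: the left call silently computed with the right call's parameter.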


Thanks a lot.

So basically, I can do everything except have 2 instances of the same function (using constant or texture memory) queued at the same time (on the same or on different streams).

Yes for different streams, I am still unsure about the same stream.
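The rule as it stands in this thread could be written down as a small host-side check. This is a hypothetical helper in plain C++, not part of OpenCV: the function labels are arbitrary strings, and whether a routine really uses constant or texture memory is an input you would have to determine from the source code.

```cpp
#include <string>
#include <vector>

enum class Verdict { Safe, Unsafe, Unknown };

// Hypothetical record of work that was enqueued and not yet waited on.
struct InFlight {
    std::string function;   // arbitrary label for the routine
    int streamId;           // which stream it was enqueued on
    bool usesSharedMemory;  // true if it uses constant or texture memory
};

// Applies the thread's conclusion to a new enqueue request.
Verdict canEnqueue(const std::vector<InFlight>& pending, const InFlight& next) {
    Verdict verdict = Verdict::Safe;
    for (const InFlight& p : pending) {
        // Different functions have their own constant buffers, and routines
        // without constant/texture state have nothing to overwrite.
        if (p.function != next.function || !next.usesSharedMemory) continue;
        // Same function on another stream: the overwrite race is possible.
        if (p.streamId != next.streamId) return Verdict::Unsafe;
        // Same function on the same stream: left open in this thread.
        verdict = Verdict::Unknown;
    }
    return verdict;
}
```

A cautious caller could treat `Unknown` like `Unsafe` and wait for the earlier call to complete before enqueuing the next one.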

Let’s see if someone has more information about the same stream situation