I would like to understand the note in the CUDA Stream documentation.
On the above page, the following is written as a note:
Currently, you may face problems if an operation is enqueued twice with different data.
I want to call knnMatchAsync() for a very large number of train descriptors (thousands).
In this case, can I call knnMatchAsync() many times with a single CUDA stream, or not?
If not, I would have to create a very large number of CUDA stream objects.
How many CUDA streams can exist at the same time? Or can I reuse a CUDA stream repeatedly?
My environment is as follows:
- os: linux/windows
- opencv: 4.4.0
- platform: java w/ org.bytedeco opencv-platform-gpu
- cuda: 11.2
- gpu: nvidia turing architecture
I am not sure which routines that note refers to. If I were you, I would inspect the source code for knnMatchAsync() to see whether any global variables are being set. That said, I am pretty sure it is not possible for the following to happen
next call may update the memory before the previous one has been finished
as kernels and async memory operations launched in the same stream should be executed in issue order with respect to each other, so one does not start before the previous one has finished. If you issued the same operation to multiple streams, however, that could be a problem if a global variable was used. So the note may be referring to the way NPP used to work before CUDA 10.1, when all operations had to be performed in the same stream (lots of the CUDA routines in OpenCV are built on top of the NPP libraries).
Alternatively, the note could describe the way CUDA used to operate when streams were first implemented, although I don’t remember it working that way.
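To make the in-order point concrete, here is a small plain-Java analogy. It is not OpenCV or CUDA code: a single-thread executor stands in for one CUDA stream, and the class name `StreamOrderSketch` and method `runOnOneQueue` are made up for this illustration. It only shows the ordering behaviour I would expect from a single stream, namely that work enqueued on one queue runs in the order it was submitted, however many items you enqueue.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Analogy only: one single-thread executor plays the role of one CUDA stream.
final class StreamOrderSketch {

    // Enqueues `calls` tasks on a single FIFO queue and returns the order
    // in which the tasks actually ran.
    static int[] runOnOneQueue(int calls) {
        ExecutorService stream = Executors.newSingleThreadExecutor();
        int[] order = new int[calls];   // written only by the worker thread
        int[] next = {0};               // next free slot in `order`
        Future<?> last = null;
        try {
            for (int i = 0; i < calls; i++) {
                final int id = i;
                // Comparable to enqueuing one async call on the stream.
                last = stream.submit(() -> { order[next[0]++] = id; });
            }
            // Comparable to waiting on the stream once, after everything is enqueued.
            if (last != null) {
                last.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            stream.shutdown();
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [0, 1, 2, 3, 4]
        System.out.println(Arrays.toString(runOnOneQueue(5)));
    }
}
```

If knnMatchAsync() behaves like an ordinary stream operation, the same pattern should apply: reuse one stream for all the calls, give each call its own output buffer (my assumption, to be safe), and call waitForCompletion() on the stream once at the end before reading the results.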