How to use custom OpenCL Queue for TAPI

logan · July 27, 2021, 6:49pm

I’m using OpenCV 4.5.2. With the OpenCV CUDA API you can wrap a cudaStream_t with cv::cuda::StreamAccessor::wrapStream(<cudaStream_t>), and the cv::cuda:: functions accept a cv::cuda::Stream argument. This lets you control which stream each operation runs in.

Now I’m implementing the OpenCL version of the CUDA backend.

I’m already using multiple OpenCL command queues for various jobs to parallel the CUDA streams implementation. Now I’m trying to use some OpenCV image processing APIs and I want them to run in specific command queues similar to how we use CUDA streams.

A few questions around this:

1. How can I wrap or transfer a cl_command_queue created with the OpenCL API to OpenCV?
2. How can I make sure TAPI functions like cvtColor run in a specific cv::ocl::Queue?
   2.a. I see the TAPI functions just grab the default Context/Queue, so maybe the question is how to set my own OpenCL command queue as the default Queue temporarily so the job goes to the right queue, and then set it back afterwards?

logan · July 30, 2021, 10:26pm

I think I found a solution to question 2. It’s true the TAPI functions grab the default Context/Device/Queue whenever they require those objects. Those defaults are accessed based on the currently bound OpenCLExecutionContext. In order to have TAPI calls run in a specific Queue you can bind a new OpenCLExecutionContext that was created with the Queue you want active. This is the easiest way I’ve found to do this:

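// myQueue is a cv::ocl::Queue created through OpenCV
auto ctx = cv::ocl::OpenCLExecutionContext::getCurrent();
auto qctx = ctx.cloneWithNewQueue(myQueue);
// s binds qctx now and re-binds the previous context when it goes out of scope
cv::ocl::OpenCLExecutionContextScope s(qctx);
// call any TAPI calls like cvtColor, merge, split, add, multiply, etc.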

“cloneWithNewQueue” reuses the underlying Context and Device. If you pass in a Queue, TAPI calls will use it once you .bind() that context or create an OpenCLExecutionContextScope like I did above.

As for question 1, I think it’s not possible to convert a cl_command_queue that was created with the OpenCL API to an OpenCV cv::ocl::Queue, at least in the version I’m using (4.5.2). That means you have to use OpenCV to create your Queues at the lowest level. It’s one-way: you can get a cl_command_queue from a cv::ocl::Queue (through its ptr() method), but you can’t get a cv::ocl::Queue from a cl_command_queue. I wish this was more like CUDA, where you can easily convert a cudaStream_t to an OpenCV cv::cuda::Stream. Having to go through OpenCV to make the Queues prevents you from creating them with the extra properties that clCreateCommandQueueWithProperties accepts.
