Hi
I’m building an application that uses OpenCV’s OpticalFlowDual_TVL1_Impl. It takes a sequence of image frames and estimates optical flow motion vectors. While profiling with Nsight Systems, I found that memory is allocated and freed every time I call OpticalFlowDual_TVL1_Impl::calc.
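For context, here is a minimal sketch of the calling pattern I’m profiling (names, parameters, and loop structure are simplified placeholders, not my exact code):

```cpp
#include <vector>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaoptflow.hpp>

// Rough shape of the per-frame loop: grayscale frames are already uploaded to the GPU.
void runLoop(const std::vector<cv::cuda::GpuMat>& frames)
{
    cv::Ptr<cv::cuda::OpticalFlowDual_TVL1> tvl1 = cv::cuda::OpticalFlowDual_TVL1::create();
    cv::cuda::GpuMat flow; // CV_32FC2 output

    for (size_t i = 1; i < frames.size(); ++i)
    {
        // Nsight Systems shows cudaMalloc/cudaFree activity inside every one of these calls.
        tvl1->calc(frames[i - 1], frames[i], flow);
    }
}
```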
I found where the memory allocation happens in the source code:
void OpticalFlowDual_TVL1_Impl::calc(InputArray _frame0, InputArray _frame1, InputOutputArray _flow, Stream& stream)
{
    const GpuMat frame0 = _frame0.getGpuMat();
    const GpuMat frame1 = _frame1.getGpuMat();

    BufferPool pool(stream);
    GpuMat flowx = pool.getBuffer(frame0.size(), CV_32FC1);   // <-- allocation happens here
    GpuMat flowy = pool.getBuffer(frame0.size(), CV_32FC1);   // <-- and here

    calcImpl(frame0, frame1, flowx, flowy, stream);

    GpuMat flows[] = {flowx, flowy};
    cuda::merge(flows, 2, _flow, stream);
}
Every time calc is called, the BufferPool’s getBuffer method allocates memory, and the buffers are freed automatically when they go out of scope at the end of the function.
I’ve looked up BufferPool, and it is essentially a memory pool that is meant to reduce the number of actual memory allocation/deallocation calls.
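As far as I understand from the docs, the intended usage pattern looks roughly like this (a minimal sketch based on my reading; the stack size and buffer shapes are placeholders):

```cpp
#include <opencv2/core/cuda.hpp>

void bufferPoolSketch()
{
    // Must be enabled before any Stream is created.
    cv::cuda::setBufferPoolUsage(true);
    // 64 MB stack size, 2 stacks per device (placeholder values).
    cv::cuda::setBufferPoolConfig(cv::cuda::getDevice(), 1024 * 1024 * 64, 2);

    cv::cuda::Stream stream;           // created after the pool is configured
    cv::cuda::BufferPool pool(stream);

    // These should be carved out of the pre-allocated stack (as long as they fit),
    // instead of triggering cudaMalloc/cudaFree.
    cv::cuda::GpuMat buf1 = pool.getBuffer(1080, 1920, CV_32FC1);
    cv::cuda::GpuMat buf2 = pool.getBuffer(1080, 1920, CV_32FC1);

    // Buffers are returned to the stack (in LIFO order) when they go out of scope.
}
```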
So I followed the example here: OpenCV: cv::cuda::BufferPool Class Reference
...
setBufferPoolUsage(true);
setBufferPoolConfig(getDevice(), 1024 * 1024 * 64, 2);
Stream stream1;
...
cv::Ptr<cv::cuda::OpticalFlowDual_TVL1> motionEstimatorCUDA;
motionEstimatorCUDA = cv::cuda::OpticalFlowDual_TVL1::create(tau, lambda, theta, nscales, warps, epsilon, iterations, scaleStep, gamma, useInitialFlow);
...
...
motionEstimatorCUDA->calc(mat1, mat2, outputFlow);
But memory is still being allocated and deallocated every time pool.getBuffer is called.
What am I missing?