CUDA FAST detector sometimes finds fewer keypoints beyond a certain Y

I have found that, depending on the picture, CUDA FAST sometimes detects fewer keypoints than the CPU version.

I am using PNG images of 1448*648 pixels (288.6 KB). When I run my algorithm, CUDA FAST finds around 1591 keypoints (the count is not consistent), while the CPU version consistently finds 1826.

The curious thing is that all the keypoints found by CUDA have Y<524 (that is, no keypoints with Y>524 are found).

What could be happening here?

Apparently this is a known problem, as reported here, although it was deemed “not a bug”.

I tried this solution:

cv::gpu::FAST_GPU detector(20, true, 1); // '1' prohibits throwing away any keypoints

and my program crashed with:

error (-215:Assertion failed) type==TYPE_9_16 in function 'create'

Apparently CUDA FAST only accepts TYPE_9_16 and no other type.

Any thoughts on this? Is there a solution for this?

The signature for the function is

static Ptr< cuda::FastFeatureDetector > create (int threshold=10, bool nonmaxSuppression=true, int type=cv::FastFeatureDetector::TYPE_9_16, int max_npoints=5000)

You are passing 1 for the type, which resolves to TYPE_7_12. The only type that is implemented is 2, TYPE_9_16, hence the assertion. I don’t know if the proposed solution still works, but I would guess that it pre-dates the type being an input to the function.