Different match results on each ORB CUDA detection run

I’m trying to speed up feature matching with ORB features using the CUDA interfaces.

After a few attempts, I noticed that the matching results for the same set of images differed from one attempt to the next. Is this behavior expected when CUDA is used?

On closer inspection, the result of detectAndCompute() for the same image seems to differ on each call. I want to know why this happens. Is it related to the concurrency characteristics of the GPU?

My goal is to find the best matching image among 1,000 images in less than a second. I’m using nfeatures = 2000 for the query image and nfeatures = 1000 for the train images. These parameters were chosen to meet my accuracy requirements. To achieve this goal, I believe GPU acceleration is essential, but this behavior is a cause for concern.
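For reference, here is a minimal pure-Python sketch of the ranking step described above. It assumes each image is scored by the number of query descriptors whose nearest train descriptor lies within a Hamming distance threshold; the scoring rule, the `max_distance` value, and all function names are illustrative assumptions, not taken from my actual code. ORB descriptors are binary strings (32 bytes by default), so Hamming distance is the natural metric.

```python
from typing import Dict, List, Tuple


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def count_good_matches(query: List[bytes], train: List[bytes],
                       max_distance: int = 40) -> int:
    """Count query descriptors whose nearest train descriptor is close enough.

    max_distance is a hypothetical threshold chosen for illustration.
    """
    if not train:
        return 0
    good = 0
    for q in query:
        nearest = min(hamming(q, t) for t in train)
        if nearest <= max_distance:
            good += 1
    return good


def best_matching_image(query: List[bytes],
                        database: Dict[str, List[bytes]],
                        max_distance: int = 40) -> Tuple[str, Dict[str, int]]:
    """Return the name of the highest-scoring image and all scores."""
    scores = {name: count_good_matches(query, descs, max_distance)
              for name, descs in database.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

With a ranking like this, a descriptor that changes between runs can move a match across the threshold and alter the score, which is why the non-repeatable results worry me.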

I would be very grateful if you could respond to my question.

What do you mean exactly?
Are the results different when detectAndCompute is called twice with the same data, or
are the descriptors different between calling detectAndCompute without CUDA and with CUDA?

there is this first issue
there is this second issue

Hi, laurent,
Thank you so much for your reply.

I mean the former case you describe.
When I call detectAndComputeAsync() twice on the same image, the descriptors obtained from the first call differ from those obtained from the second call. Is this behavior unavoidable when using the GPU?

I want to know why this happens and, if possible, how to avoid it so that the results are stable.
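To narrow this down, one check I can run is whether the two calls return the same features in a different order or genuinely different keypoints and descriptors. Below is a minimal sketch of such a comparison. It assumes the results of each call have already been downloaded to the host and converted to plain `(x, y, octave, descriptor_bytes)` tuples; that tuple layout and the `compare_runs` name are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

# (x, y, octave, descriptor bytes): a hypothetical host-side representation
Feature = Tuple[float, float, int, bytes]


def compare_runs(run_a: List[Feature], run_b: List[Feature]) -> Dict[str, object]:
    """Compare two detection runs independently of keypoint order."""
    # Key each feature by its rounded position and pyramid level.
    map_a = {(round(x, 2), round(y, 2), o): d for x, y, o, d in run_a}
    map_b = {(round(x, 2), round(y, 2), o): d for x, y, o, d in run_b}

    shared = map_a.keys() & map_b.keys()
    same_desc = sum(1 for k in shared if map_a[k] == map_b[k])

    return {
        "identical_order": run_a == run_b,            # exact, ordered equality
        "only_in_a": len(map_a.keys() - map_b.keys()),
        "only_in_b": len(map_b.keys() - map_a.keys()),
        "shared": len(shared),
        "shared_same_descriptor": same_desc,          # same keypoint, same bits
    }
```

If `identical_order` is false but every keypoint is shared with an identical descriptor, the difference would be one of ordering only, and sorting both runs before matching might be enough. If keypoints or descriptor bits themselves differ, the cause lies elsewhere.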

The second link you posted might answer my question, so I will check it.

