Matching and Code improvement questions

Hello, I tried to do some research into the topic, but didn't find much. I really have two questions.

  1. Is there a faster, more accurate method for finding an image inside an image that doesn't need to handle scale or rotation differences? Mainly, I understood that template matching doesn't really account for color images, so the needle seems to appear in places where it clearly isn't; my results always return something like 1800 possible matches that I then have to compare to find maybe 7 (though sometimes it finds 8 or 6, depending on the precision) out of the 7 that should be there. I wouldn't need to find color ranges either. Just simply: find all locations of this one needle.

  2. Just wondering if there is a way to speed up line two here - specifically the numpy.where call - so it can happen on the GPU instead of waiting to download the result and then comparing on the CPU? Not really sure it would net me any speed improvement, but I figure that if everything can be done on the GPU, it should at least help cover the overhead of sending the job to the GPU.

gresult = cv2.cuda.createTemplateMatching( cv2.CV_8UC1, cv2.TM_SQDIFF_NORMED ).match( self.frame_buffer_cuda, template ).download()
points = [[25 + pt[0], 225 + pt[1]] for pt in numpy.column_stack( numpy.where( gresult <= ( 1 - precision ) )[::-1] )]

you were told wrong. it’s perfectly capable of that.

perhaps the CUDA flavor doesn’t (I don’t know it) but the regular matchTemplate does.

wrong TM mode? all the CCORR modes are ill-suited to image comparison; only the SQDIFF modes make sense. and you shouldn’t trust any “NORMED” modes - figure out the thresholds yourself. and don’t forget to perform NMS. use the matching score for that, don’t just merge bounding boxes.
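as a rough illustration of score-based NMS (a minimal sketch in plain NumPy; `boxes` and `scores` here are hypothetical inputs you would build from your match locations, and for SQDIFF scores you’d invert or negate them first so that “best” sorts highest):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Greedy non-maximum suppression keyed on the matching score.

    boxes  : (N, 4) array of [x1, y1, x2, y2]
    scores : (N,) array; higher = better match
    Returns the indices of the boxes to keep.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]              # best score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # intersection of the best remaining box with all the others
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]    # drop heavy overlaps
    return keep
```

with this, two near-identical detections of the same object collapse to the single one with the better score, while distant detections survive untouched.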

for CUDA specifically, you’ve already found the code.

for OpenCL, work with UMat, which allows many functions to transparently use OpenCL kernels on any capable “devices” (GPUs), but also CPUs.

Thanks for the quick reply.

Awesome to hear. Sorry, I misunderstood how it works then. I read somewhere that it didn't, and during testing I saw it would find a similar object that was blue instead of red until I increased the threshold to filter it out. Some of the objects it currently finds are also the same shape but in different color ranges, not duplicated locations (the kind that would need NMS).

Correct, I am currently only using cv2.TM_SQDIFF in this case. As for cv2.TM_SQDIFF vs cv2.TM_SQDIFF_NORMED: I was using NORMED because it returned fewer results than the non-NORMED version, which gave better performance since I didn't have to loop over as many entries when testing similarity or filtering the numpy.where result.

I assume there isn't a way to force these methods to account for color more strictly?

As for using NMS - I will look into that next. The examples I saw didn't use it, so this is the first time I'm hearing of it. Are there any other methods I should be using as well?

As for the numpy.where question, I guess there isn't a way to perform that operation on the GPU instead?

browse the matchtemplate tag for previous discussions, including NMS.

GPUs can do some things but not others. they’re big vector processors. the essence of np.nonzero (np.where has two different modes…) is not that well suited to GPUs.
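for reference, the two np.where modes mentioned above (a quick plain-NumPy illustration; the `scores` array is made-up data):

```python
import numpy as np

scores = np.array([[0.9, 0.1],
                   [0.05, 0.8]])

# one-argument mode: np.where(cond) is equivalent to np.nonzero(cond);
# it returns the coordinates of the True entries, row indices first
ys, xs = np.where(scores <= 0.2)
print([(int(x), int(y)) for x, y in zip(xs, ys)])   # [(1, 0), (0, 1)]

# three-argument mode: element-wise selection between two values,
# same shape as the input -- this one *is* a good fit for vector hardware
mask = np.where(scores <= 0.2, 1, 0)
print(mask.tolist())                                # [[0, 1], [1, 0]]
```

the first mode produces a variable-length output (how many hits there are isn't known up front), which is exactly the part that maps poorly to GPUs.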

questions about numpy are best asked in places where that’s a specialty.


Going over a few of them right now.

Understood. I was mostly just checking whether there was a crossover of numpy.where somewhere in OpenCV that would run on the GPU. I did run across someone saying to use nonzero; I just wasn't sure whether that was possible. Though I'm sure over there they would say the same thing about asking here :)

an example of matching color images is here

not sure what that does differently from regular matchTemplate. appears to involve quaternions, so perhaps it’s supposed to be invariant to color changes? docs are bare, essentially nonexistent. care to elaborate? having to use git blame to find the pull request that does mention some references doesn’t help make that code accessible.

regular matchTemplate, with TM_SQDIFF, simply takes element-wise differences. if something is a different color (but has the same appearance otherwise), that’ll cause differences, same as if there were no instances of the template at all.

if you want colors to not matter, a simple conversion to grayscale might already do the trick…
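to make the element-wise point concrete, here’s a small plain-NumPy sketch of what TM_SQDIFF computes at one location (the 1-pixel BGR “patches” are hypothetical; the green value is deliberately chosen to have nearly the same BT.601 luma as the red, since grayscale only hides a color change when the intensities end up similar):

```python
import numpy as np

def sqdiff(patch, template):
    # what TM_SQDIFF computes at a single search position:
    # the sum of element-wise squared differences
    d = np.asarray(patch, float) - np.asarray(template, float)
    return float((d * d).sum())

def to_gray(bgr):
    # BT.601 luma weights, as used by cv2.cvtColor(..., COLOR_BGR2GRAY)
    return 0.114 * bgr[..., 0] + 0.587 * bgr[..., 1] + 0.299 * bgr[..., 2]

red   = np.array([[[0, 0, 255]]], float)   # pure red, BGR order
green = np.array([[[0, 130, 0]]], float)   # green with nearly the same luma

print(sqdiff(red, red))                      # 0.0 -> identical colors match perfectly
print(sqdiff(red, green))                    # 81925.0 -> a color change scores as a big mismatch
print(sqdiff(to_gray(red), to_gray(green)))  # ~0.004 -> in grayscale the difference all but vanishes
```

so in color, a recolored instance scores as badly as no instance at all, while after a grayscale conversion it can score as a near-perfect match.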

To be clear, yes, I want it to match color to color. If anything, I want it to be a bit more accurate than what the default TM_SQDIFF provides when searching for the needle in the haystack, but I assume there isn't a way to improve on that, or another method outside those three options?

total: 0.049269914627075195 a: 0.037168264389038086 b: 0.01210165023803711
total: 0.0786600112915039 a: 0.05359148979187012 b: 0.02506852149963379
Total time it takes to do A) the CUDA template match and B) filtering the results with numpy.where. It just seems like a lot of time is wasted filtering the results.

I did try the above code (ColorMatching) for matching. While it might work, it's very slow :)

I did take some code that crackwitz made for the mask in the CPU path. I don't see a way to add a mask to the GPU function though - is that normal?

MRE please. I can’t even verify your claims, let alone debug your code.