# Using OpenCV for Image Similarity

I’m trying to compare two images and return a score for how similar the second image is to the original. I watched several videos on how to do this, but nothing seems to return the correct answer, because the closer the second image is to the first one, the lower the score gets.

My idea is to have `image 1` as the original that the other images are compared against; images 2-4 are just for testing. The goal is to produce a final image, similar to `image 4`, that looks very close to `image 1`, then compare the two and get a score for how similar `image 4` is to `image 1`.

Here is the code I’m using, from this link:
Python Compare Two Images

```python
from skimage.metrics import structural_similarity as ssim
import matplotlib.pyplot as plt
import numpy as np
import cv2

def mse(imageA, imageB):
    # the 'Mean Squared Error' between the two images is the
    # sum of the squared difference between the two images;
    # NOTE: the two images must have the same dimension
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])

    # return the MSE; the lower the error, the more "similar"
    # the two images are
    return err

def compare_images(imageA, imageB, title):
    # compute the mean squared error and structural similarity
    # index for the images
    m = mse(imageA, imageB)
    s = ssim(imageA, imageB)

    # setup the figure
    fig = plt.figure(title)
    plt.suptitle("MSE: %.2f, SSIM: %.2f" % (m, s))

    # show the first image
    fig.add_subplot(1, 2, 1)
    plt.imshow(imageA, cmap=plt.cm.gray)
    plt.axis("off")

    # show the second image in its own subplot, so it does not
    # draw over the first one
    fig.add_subplot(1, 2, 2)
    plt.imshow(imageB, cmap=plt.cm.gray)
    plt.axis("off")

    # show the images
    plt.show()

# load the images -- the original and the image to compare
# (paths are placeholders; adjust to your files)
original = cv2.imread("original.png")
new = cv2.imread("new.png")

# convert the images to grayscale
original = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
new = cv2.cvtColor(new, cv2.COLOR_BGR2GRAY)

# compare the images
compare_images(original, new, "Original vs. New")
```

The problem is that if I test `image 1` against `image 2`, it returns about an 80% match, but if I test `image 1` against `image 4`, it returns a lower match of about 70%. Even though `image 1` and `image 4` look very similar (though still not the same image), it should have returned something close to 90%.

Any idea how I can fix this, or whether there is another method that is a lot easier? By the way, I just started learning OpenCV.

mean squared error is a distance (smaller == better),
the opposite of “similarity” …

it’s even written verbatim in the comment …
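Since MSE is a distance, one way to read it as a score is to map it into [0, 1], where 1 means identical. A minimal sketch, assuming 8-bit images (so 255² is the worst possible per-pixel squared error; this normalization is my choice, not from the tutorial):

```python
import numpy as np

def mse(a, b):
    # mean squared error per pixel (a distance: lower == more similar)
    return np.mean((a.astype("float64") - b.astype("float64")) ** 2)

def mse_similarity(a, b):
    # map MSE into [0, 1], where 1 == identical, assuming 8-bit images;
    # 255**2 is the largest possible per-pixel squared error
    return 1.0 - mse(a, b) / 255.0**2

img = np.full((4, 4), 200, dtype=np.uint8)
same = img.copy()
far = np.zeros((4, 4), dtype=np.uint8)

print(mse_similarity(img, same))  # identical -> 1.0
print(mse_similarity(img, far))   # very different -> much lower
```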


also, per-pixel differences are a bad metric for these kinds of sketches.

you can’t solve this by winging it. you need an education in computer vision to tackle such problems. I don’t mean learning a library, but learning the theory, and learning things that you don’t know exist.


I see. So that is another way of calculating the similarity: the MSE and the SSIM.
After testing, I realized that MSE gave a more accurate percentage than SSIM.
However, image 3 gave a lower percentage (i.e. a higher MSE score) than image 2, while image 4 gave a higher percentage (a lower MSE score).

path-to-opencv\sources\doc\tutorials\videoio\video-input-psnr-ssim

you could extract the line points:

```python
inv = ~img  # invert: white lines on black background
points = cv2.findNonZero(inv)
```

then use shape matching, e.g. the Hausdorff distance:

```python
haus = cv2.createHausdorffDistanceExtractor()
r = haus.computeDistance(pts_a, pts_b)
```

I tried doing that, but after running the script nothing happens.

```python
import cv2

pts_a = cv2.findNonZero(img1)
pts_b = cv2.findNonZero(img2)

hd = cv2.createHausdorffDistanceExtractor()

d1 = hd.computeDistance(pts_a, pts_b)

print(d1)
```

I don’t believe that …

and you missed the inverting step
(you need white-on-black lines to use findNonZero())

Sorry, but I’m new to these functions in Python.
What do you mean I missed the inverting step?
Inverting what exactly?

Also, I did convert the images to grayscale (white on black, unless that’s not what it is).

black is 0 is `false` is background; white is non-zero is `true` is foreground. that’s the basic assumption everywhere.
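A tiny numpy-only sketch of what that means in practice (the 5×5 “sketch” image here is made up for illustration; for uint8 images, `255 - img` has the same effect as `cv2.bitwise_not`):

```python
import numpy as np

# a black line drawing on a white background, as a typical sketch loads
img = np.full((5, 5), 255, dtype=np.uint8)  # all white
img[2, 1:4] = 0                             # one black line

# without inverting, the white BACKGROUND counts as foreground
print(np.count_nonzero(img))   # 22 pixels: almost everything

# inverting makes the line white-on-black, matching the 0 == background
# convention that findNonZero() assumes
inv = 255 - img                # same effect as cv2.bitwise_not(img)
print(np.count_nonzero(inv))   # 3 pixels: just the line
```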

this is the inverted version of your first image:


I did that, but now I’m getting an error after resizing the images. If I don’t resize them, then nothing prints out, like before.

```python
import cv2

img1 = cv2.bitwise_not(img1)
img2 = cv2.bitwise_not(img2)

img1 = cv2.resize(img1, (100, 200))
img2 = cv2.resize(img2, (100, 200))

pts_a = cv2.findNonZero(img1)
pts_b = cv2.findNonZero(img2)

hd = cv2.createHausdorffDistanceExtractor()
d1 = hd.computeDistance(pts_a, pts_b)

print(d1)
```

Error:

```
line 12, in <module> pts_a = cv2.findNonZero(img1)
cv2.error: OpenCV(4.5.4) D:\a\opencv-python\opencv-python\opencv\modules\core\src\count_non_zero.dispatch.cpp:160: error: (-215:Assertion failed) src.channels() == 1 && src.dims == 2 in function 'cv::findNonZero'
```

`cv2.imread()` with the default flag results in a 3-channel image. Use:

```python
cv2.imread("Test\i1.png", cv2.IMREAD_GRAYSCALE)
```

instead, to get the required single-channel input.

That worked, thanks!

I realized that it doesn’t print anything when I don’t resize the images.
If I resize them to (100, 200) it returns a value, but with (200, 300) it takes forever, and the value changes. It appears that the distance increases with the image size.
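That behaviour is expected: the Hausdorff distance is measured in pixels, so resizing the images rescales the result, and matching every point of one set against every point of the other grows quadratically with the point count, which is why larger images take much longer. A numpy-only sketch of the idea (a brute-force Hausdorff over plain (N, 2) point arrays; `cv2.findNonZero` output would need `.reshape(-1, 2)` first, and normalizing by the image diagonal is a suggestion, not an OpenCV feature):

```python
import numpy as np

def hausdorff(pts_a, pts_b):
    # brute-force symmetric Hausdorff distance between two (N, 2) point
    # sets: the largest nearest-neighbor distance in either direction
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def normalized_hausdorff(pts_a, pts_b, shape):
    # divide by the image diagonal so scores from differently sized
    # images become comparable
    diag = np.hypot(shape[0], shape[1])
    return hausdorff(pts_a, pts_b) / diag

# two toy point sets, offset by one pixel
a = np.array([[0, 0], [10, 10]], dtype=np.float64)
b = np.array([[0, 1], [10, 11]], dtype=np.float64)

print(hausdorff(a, b))                         # 1.0 (one pixel apart)
print(normalized_hausdorff(a, b, (100, 200)))  # same mismatch, size-independent
```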
