My main goal is to cut the original image into sticker-sized crops, run OCR on each crop, and check whether any sticker is a duplicate or misprinted.
The Python script below locates every sticker's corners and successfully converts each crop to text with pytesseract.
Question: I want to optimize the speed. With 50 stickers, the locate step has to run 50 times. If I could get the locations of only the (0,0), (0,1) and (1,0) stickers, I could deduce the (x,y) of all the others, because the spacing between stickers is constant.
I marked the locations (x,y) I want in this picture link: (Python) just detect few corners instead of all from img, to optimize the speed - Album on Imgur
Pic 02 in the link shows the expected corners (x,y) I want to get.
Right now I get every sticker's corner (x,y) by running the detection 50 times and listing the results,
i.e. by executing "code line 26 - line 28" 50 times. I want to execute "code line 26 - line 28" only a few times, just enough to get the 4 red X marks, in order to save processing time.
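To make the idea concrete, here is a minimal sketch of how the remaining corners could be deduced from three measured ones. It assumes the stickers sit on a regular grid and that the three reference stickers share row 0 or column 0 with the origin sticker. The function name `grid_from_references` and its parameters are made up for illustration; the sample coordinates are top-left corners taken from the output listed further down in this post.

```python
def grid_from_references(origin, col_ref, col_index, row_ref, row_index, rows, cols):
    """Return {(row, col): (x, y)} with the predicted top-left corner of every sticker.

    origin  -- measured corner of sticker (row 0, col 0)
    col_ref -- measured corner of sticker (row 0, col col_index)
    row_ref -- measured corner of sticker (row row_index, col 0)
    """
    # Step from one column to the next, and from one row to the next.
    # Keeping both x and y in each step also absorbs a small rotation of the sheet.
    col_step = ((col_ref[0] - origin[0]) / col_index,
                (col_ref[1] - origin[1]) / col_index)
    row_step = ((row_ref[0] - origin[0]) / row_index,
                (row_ref[1] - origin[1]) / row_index)

    grid = {}
    for r in range(rows):
        for c in range(cols):
            x = origin[0] + c * col_step[0] + r * row_step[0]
            y = origin[1] + c * col_step[1] + r * row_step[1]
            grid[(r, c)] = (round(x), round(y))
    return grid


# Three corners from the output in this post (6 columns x 5 rows are visible there):
# top-left sticker, top-right sticker, bottom-left sticker.
corners = grid_from_references(origin=(50, 132),
                               col_ref=(568, 131), col_index=5,
                               row_ref=(45, 586), row_index=4,
                               rows=5, cols=6)
print(corners[(4, 5)])  # predicted corner of the bottom-right sticker
```

With these three references the bottom-right sticker is predicted at (563, 585), while the detected value in the output is (566, 589). Adjacent stickers such as (0,0), (0,1) and (1,0) also work, but any measurement error in them is multiplied by the number of steps, so references that are far apart are likely to be more stable.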
- .py script
import os
import cv2
import numpy as np
from PIL import Image
from PIL import ImageDraw
from PIL import ImageFont
import pytesseract

image = cv2.imread("/home/student_joy/desktop/optimization_11_10/original_duplicate.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
white_bg = 255*np.ones_like(image)
ret, thresh = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY_INV)
blur = cv2.medianBlur(thresh, 1)
kernel = np.ones((10, 20), np.uint8)
img_dilation = cv2.dilate(blur, kernel, iterations=1)
# [-2:] keeps (contours, hierarchy) on OpenCV 3.x (3 return values) and 4.x (2 return values)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Sort contours left to right by the x of their bounding box
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])
xy_list = []
listOfElems = []
listOfDuplicate = []
list_for_duplicate_x_and_y = [ ]
for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)
    roi = image[y:y + h, x:x + w]
    # Keep only sticker-sized boxes
    if (h > 50 and w > 50) and h < 200:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 1)
        print("= = = = = = = = ")
        print("left top , and right down corner" )
        print(x , y ) # left top corner (x , y )
        print(x + w , y + h ) # right down corner (x , y ) = (x + w , y + h )
        # Snap x to the nearest known column and move to the text area; 0 means no match
        for xc in (45,150,255,360,465,570):
            if xc-20 < x < xc+20:
                x = xc + 26
                break
        else:
            x = 0
        # Snap y to the nearest known row and move to the text area; 0 means no match
        for yc in (132, 243,586,357,470):
            if yc-20 < y < yc+20:
                y = yc + 48
                break
        else:
            y = 0
        print("new number" , x , y )
        tem_list_x_and_y = [ ]
        tem_list_for_duplicate_x_and_y = [ ]
        if (x != 0) and (y != 0):
            tem_list_x_and_y.append(x)
            tem_list_x_and_y.append(y)
            xy_list.append(tem_list_x_and_y)
            # Fixed-size crop around the printed text
            w = 59
            h = 23
            new_crop = image[y:y+h, x:x+w]
            text = pytesseract.image_to_string(new_crop, lang='eng').strip()
            if text not in listOfElems:
                listOfElems.append(text)
                print(text)
                print("= = = = = = = = ")
                print(" ")
            else:
                print("Duplicate text is here:")
                print(text)
                print("x :" , x , "y :",y)
                tem_list_for_duplicate_x_and_y.append(x)
                tem_list_for_duplicate_x_and_y.append(y)
                # Store the duplicate's own list (the original appended tem_list_x_and_y here)
                list_for_duplicate_x_and_y.append(tem_list_for_duplicate_x_and_y)
                print("= = = = = = = = ")
                print(" ")
- output for 50 stickers: top-left and bottom-right corner (x,y) of each sticker, and the image-to-text result
= = = = = = = =
left top , and right down corner
0 0
715 140
new number 0 0
= = = = = = = =
left top , and right down corner
44 472
135 542
new number 71 518
CAT4B5
= = = = = = = =
= = = = = = = =
left top , and right down corner
44 357
136 426
new number 71 405
CA7T4BB
= = = = = = = =
= = = = = = = =
left top , and right down corner
45 586
136 653
new number 71 634
CATAAF
= = = = = = = =
= = = = = = = =
left top , and right down corner
46 242
135 311
new number 71 291
CAT4C1
= = = = = = = =
= = = = = = = =
left top , and right down corner
50 132
139 198
new number 71 180
‘CAT4C7
= = = = = = = =
= = = = = = = =
left top , and right down corner
148 472
239 542
new number 176 518
CAT4B6
= = = = = = = =
= = = = = = = =
left top , and right down corner
149 587
240 654
new number 176 634
CAT4B0
= = = = = = = =
= = = = = = = =
left top , and right down corner
149 357
241 426
new number 176 405
CAT4BC
= = = = = = = =
= = = = = = = =
left top , and right down corner
150 243
241 311
new number 176 291
CAT4C2
= = = = = = = =
= = = = = = = =
left top , and right down corner
153 132
243 198
new number 176 180
CAT4C8
= = = = = = = =
= = = = = = = =
left top , and right down corner
253 588
342 655
new number 281 634
= = = = = = = =
= = = = = = = =
left top , and right down corner
253 473
343 543
new number 281 518
CAT4B7
= = = = = = = =
= = = = = = = =
left top , and right down corner
254 357
346 427
new number 281 405
Duplicate text is here:
x : 281 y : 405
= = = = = = = =
= = = = = = = =
left top , and right down corner
255 243
345 311
new number 281 291
CATAC3
= = = = = = = =
= = = = = = = =
left top , and right down corner
257 132
347 198
new number 281 180
CAT4C9
= = = = = = = =
= = = = = = = =
left top , and right down corner
357 588
447 656
new number 386 634
CAT4B2
= = = = = = = =
= = = = = = = =
left top , and right down corner
358 473
448 543
new number 386 518
CA7T4B8
= = = = = = = =
= = = = = = = =
left top , and right down corner
358 361
448 430
new number 386 405
CATACS
= = = = = = = =
= = = = = = = =
left top , and right down corner
359 243
450 312
new number 386 291
CATAC4
= = = = = = = =
= = = = = = = =
left top , and right down corner
360 132
452 198
new number 386 180
CATACA
= = = = = = = =
= = = = = = = =
left top , and right down corner
461 589
551 657
new number 491 634
CATABS
= = = = = = = =
= = = = = = = =
left top , and right down corner
462 474
552 544
new number 491 518
CAT4B9
= = = = = = = =
= = = = = = = =
left top , and right down corner
463 358
554 428
new number 491 405
CAT4BF
= = = = = = = =
= = = = = = = =
left top , and right down corner
463 243
554 312
new number 491 291
CAT4CS
= = = = = = = =
= = = = = = = =
left top , and right down corner
464 131
556 198
new number 491 180
CAT4CB
= = = = = = = =
= = = = = = = =
left top , and right down corner
566 589
655 658
new number 596 634
CAT4B4
= = = = = = = =
= = = = = = = =
left top , and right down corner
567 474
659 544
new number 596 518
CATABA
= = = = = = = =
= = = = = = = =
left top , and right down corner
567 361
658 430
new number 596 405
CATACE
= = = = = = = =
= = = = = = = =
left top , and right down corner
568 244
659 312
new number 596 291
Duplicate text is here:
CATACE
x : 596 y : 291
= = = = = = = =
= = = = = = = =
left top , and right down corner
568 131
660 199
new number 596 180
CAT4CC
= = = = = = = =
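In the output above, the sticker at (281, 634) appears to have produced an empty OCR string, and the sticker at (281, 405) was then reported as a duplicate with no text. A possible way to handle this is to group the readings by text and keep unreadable crops separate, so an empty result is never counted as a duplicate. This is a minimal sketch: `group_by_text` is a hypothetical helper, and the sample readings are copied from the output above.

```python
from collections import defaultdict


def group_by_text(readings):
    """readings: iterable of (text, (x, y)) pairs.

    Returns (duplicates, unreadable), where duplicates maps each text seen
    more than once to all of its positions, and unreadable lists the
    positions whose OCR result was empty.
    """
    positions = defaultdict(list)
    unreadable = []
    for text, xy in readings:
        text = text.strip()
        if not text:
            unreadable.append(xy)  # OCR returned nothing for this crop
        else:
            positions[text].append(xy)
    duplicates = {t: p for t, p in positions.items() if len(p) > 1}
    return duplicates, unreadable


# A few (text, position) pairs taken from the output above.
readings = [
    ("", (281, 634)),
    ("", (281, 405)),
    ("CATACE", (596, 405)),
    ("CATACE", (596, 291)),
    ("CAT4CC", (596, 180)),
]
duplicates, unreadable = group_by_text(readings)
print(duplicates)
print(unreadable)
```

Unlike the `text not in listOfElems` check, this keeps the position of the first occurrence as well, which helps when deciding which of the two stickers is the misprint.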
So the output shows that I can print every sticker's corner (x,y), but at the cost of processing time. I just want the 4 sticker corners (the red X marks in the shared picture link) to speed things up. Thanks.
I also asked this on Stack Overflow, but I personally think the OpenCV forum is a great platform too.