Pytesseract identifies "q" as "a" and "i" as "I"

Hi,

I have these captchas that I want to solve using pytesseract and OpenCV:

[image: bad1]

I get:
8al

This is my code:

import cv2 as cv
import pytesseract

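img = cv.imread('resize.png')  # cv.imread returns channels in BGR order
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)  # so convert from BGR, not RGB
_, th = cv.threshold(gray, 115, 255, cv.THRESH_BINARY)  # fixed global threshold at 115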

print(pytesseract.image_to_string(th, lang="eng", config='--psm 10 -c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz'))

Is there anything I could do to improve the output?

Welcome.

You have reached the forum for OpenCV, not Tesseract.

Perhaps you would like to talk to the fine people at https://groups.google.com/g/tesseract-ocr instead.

Further, that looks like a CAPTCHA. CAPTCHAs are security measures that separate humans from robots. Defeating security measures seems unethical. If you can do it, nobody will stop you. I wouldn’t expect much assistance though.

Also: crosspost:
