Image Grayscale (tesseract OCR)

walmeida · August 5, 2023, 4:15am

Hi guys! I am trying to extract text from a screenshot held in memory using pytesseract, without reading or writing a file on disk.

this is my screenshot:

Comparing the two grayscale images, one from cvtColor and one from imread, we can see that they are different.
Starting from gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), my threshold
limiar, imgThreash = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU) does not work. Does anyone have an idea how I can work around this and threshold the screenshot without first saving it to a file and reading it back with imread?

PS: I tried to upload the other example images, but the website blocked me because I'm a new user!

Best regards!

crackwitz · August 5, 2023, 2:05pm

Dima_Fantasy was an AI spam bot. do not trust anything the AI spam bot posted.

walmeida · August 5, 2023, 2:11pm

@crackwitz, sorry about that. I did not see that.
thanks.

walmeida · August 6, 2023, 12:57pm

I finally found a solution to my problem.

Here is the code:

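import cv2
import numpy as np
from PIL import Image, ImageGrab

def ThresholdFromScreenShot(tupleCoordenates):
    # Grab the screen region in memory (no file on disk) as a NumPy array
    pixels = np.array(ImageGrab.grab(bbox=tupleCoordenates))
    # Convert to 8-bit grayscale through PIL
    gray_f = np.array(Image.fromarray(pixels).convert('L'))
    # Inverted Otsu threshold; with THRESH_OTSU the level is computed, so 127 is ignored
    limiar, imgThreash = cv2.threshold(gray_f, 127, 255,
                                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # imgThreash is already single-channel, so this conversion leaves it unchanged
    gray_s = np.array(Image.fromarray(imgThreash).convert('L'))
    # Blur, then keep only pixels whose whole 3x3 neighborhood was white
    blur = cv2.blur(gray_s, (3, 3))
    limiar, thresh = cv2.threshold(blur, 240, 255, cv2.THRESH_BINARY)
    return thresh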

crackwitz · August 6, 2023, 2:23pm

that is equivalent to using cv.cvtColor()

also… you use ImageGrab already. why convert to numpy array and then back to PIL Image? you could just call convert("L") directly on that

that seems entirely superfluous

and those two threshold() calls could be just one, followed by the “dilation” morphology operation

walmeida · August 6, 2023, 3:23pm

Hi @crackwitz, I'm new to Python and OpenCV, but the grayscale result from cvtColor renders red closer to gray, while np.array(Image.fromarray(pixels).convert('L')) renders red closer to white! In the first case, pytesseract fails to extract the text.

Thanks!

crackwitz · August 6, 2023, 9:22pm

crosspost:
