I am trying to OCR photos of a 7-segment LCD showing meter readings. The photos are taken with an ESP32-CAM and saved periodically. The OCR routine should extract the digits and, optionally, the decimal point. A pretty common task, I thought.
I tried several approaches, e.g. GitHub - arturaugusto/display_ocr: Real-time image preprocess and OCR. or gImageReader › Wiki › ubuntuusers.de. They all work, but a successful readout depends strongly on selecting the right section of the photo, or on picking the correct threshold when converting gray to black&white, for example. So parameters that work for one picture will not work for another, maybe because the lighting was slightly different.
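One thing that might reduce the threshold sensitivity is a local adaptive threshold instead of a fixed global one; ImageMagick has this as -lat. Here is a sketch I'm considering (the crop geometry is the one from my command below; the function name and the -lat parameters are just my guesses, not tuned values):

```shell
# Sketch: preprocess <in.jpg> <out.jpg>
# -normalize stretches the contrast, -lat applies a Local Adaptive
# Threshold, which should be less sensitive to overall brightness
# than a fixed -threshold value. 25x25+10% is an untuned guess.
preprocess() {
  convert "$1" -crop 330x112+100+48 \
          -colorspace Gray -normalize \
          -lat 25x25+10% \
          "$2"
}
```

Usage would be e.g. preprocess pic_0001.jpg out.jpg before handing out.jpg to tesseract.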
I finally arrived at a simple CLI/bash version:

for f in pic_*.jpg; do
  convert "$f" -crop 330x112+100+48 out.jpg
  tesseract out.jpg - -l letsgodigital --psm 7
done
It returned 639758 for the first photo – pretty good (only the 3rd digit is wrong), but
-7, 6,-8- for the second.
To the human eye the two photos aren't that different. What makes the big difference for the OCR procedure?
Any help or suggestion is highly appreciated.
Best regards, Rupert
PS: Something like a confidence level as an additional result would be very helpful, from 0 ("poor picture, nothing to read") to 100% ("perfect result, no doubt about it").
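PPS: I just found that tesseract can report per-word confidence via its tsv output (the conf column, 0–95, with -1 on non-text rows). A sketch for averaging it into a rough 0–100 score (the helper name is mine):

```shell
# Sketch: mean_conf reads tesseract tsv output on stdin and averages
# the 'conf' column (field 11; -1 rows and the header are skipped).
mean_conf() {
  awk -F'\t' 'NR > 1 && $11 >= 0 { s += $11; n++ }
              END { if (n) printf "%.1f\n", s / n; else print 0 }'
}
# Usage (out.jpg from the crop step above):
# tesseract out.jpg - -l letsgodigital --psm 7 tsv | mean_conf
```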