P-solver wrote: Fri Apr 03, 2020 5:21 am
nigelbartlett1 wrote: Thu Apr 02, 2020 12:10 pm
Try Tesseract. I used it to read low-resolution scans of a paper form, looking for a particular 13-digit number. It was accurate about 50% of the time. If you can accept single-digit errors, the success rate goes up to just below 80%. I never got a chance to try it with higher-resolution scans; certainly worth a go.
Code:
sudo apt -y install tesseract-ocr
# For help:
man tesseract
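To make "accepting single-digit errors" concrete, here is a minimal sketch of how such a check could look when the number you are looking for is known in advance. The function names and the sample numbers are made up for illustration; this is not necessarily how the test above was scored.

```python
from typing import Optional


def digit_mismatches(ocr_text: str, expected: str) -> Optional[int]:
    """Count positions where two equal-length strings differ.

    Returns None when the lengths differ, because a dropped or extra
    character cannot be compared position by position.
    """
    if len(ocr_text) != len(expected):
        return None
    return sum(1 for a, b in zip(ocr_text, expected) if a != b)


def is_acceptable(ocr_text: str, expected: str, max_errors: int = 1) -> bool:
    """Accept a reading that differs from the expected number in at most max_errors digits."""
    mismatches = digit_mismatches(ocr_text, expected)
    return mismatches is not None and mismatches <= max_errors


# Hypothetical 13-digit examples: the first reading has one wrong digit, the second has two.
print(is_acceptable("1234567890123", "1234567890128"))
print(is_acceptable("1234567890123", "1234567890188"))
```

With `max_errors=0` the same function gives the strict "exact match" rate, so both figures can be measured from one set of scans.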
Hi, and thanks! This is exactly what I'm looking for.
I have now installed and set up my Pi and will run some tests with Tesseract this weekend.
I'll be back with more info on how it turns out.
OK, first report on my tests with Tesseract.
Equipment currently used for testing:
- A "manual" photo of the display, taken with my phone (since my Pi camera hasn't been delivered yet)
- Raspberry Pi 4
- Installed/downloaded/updated software: Raspbian, Tesseract, Pillow
- Some Python code
What I did:
1. I put the image "IMG_0126" on the Pi desktop in Raspbian
2. Wrote the script "img_conv.py" in Python and saved it on the Pi desktop in Raspbian
3. Ran the script and got an output image, "bw_display", plus a text string showing "2829)" (yes, one character is missing)
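Since the first run dropped a character and added a stray ")", it may be worth sanity-checking the string before trusting it. Below is a small sketch that keeps only the digits and rejects readings with the wrong digit count. `clean_reading` and `EXPECTED_DIGITS` are names I made up, and the value 5 is only an assumption; set it to however many digits your display shows.

```python
import re
from typing import Optional


def clean_reading(raw_text: str, expected_digits: int) -> Optional[str]:
    """Strip everything except digits and check the digit count.

    Returns the digit string, or None if the count does not match,
    which signals that the reading should be discarded or retried.
    """
    digits = re.sub(r"\D", "", raw_text)
    return digits if len(digits) == expected_digits else None


# EXPECTED_DIGITS is an assumption for illustration, not a value from the display itself.
EXPECTED_DIGITS = 5
reading = clean_reading("2829)\n", EXPECTED_DIGITS)
if reading is None:
    print("Reading rejected: wrong number of digits")
else:
    print("Reading accepted:", reading)
```

A rejected reading could simply trigger a new photo once the Pi camera is in place, rather than logging a wrong value.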
Important notes:
- Regarding pytesseract: I got some really strange values in its output until I cropped the image so that it shows only the numbers on the display.
So I guess the important points for carrying on with this part of the project are:
1. Have a really good image to start with. Luckily the display is backlit, so I'm thinking of making a screen or shroud around the Pi camera that shields the display from reflections of surrounding lights.
2. Crop the image so that it contains only the text you want to read before processing it; this helps a lot with the end result.
More updates will come when I get my Pi camera.
img_conv.py
Code:
from PIL import Image, ImageFilter
import pytesseract
# Open the original photo
orig_image = Image.open('IMG_0126.jpg')
# Define the area containing the text to read: top-left corner plus width and height, in pixels
left = 1660
top = 1200
right = left + 620
bottom = top + 180
# Crop the photo to that area
display = orig_image.crop((left, top, right, bottom))
# Convert the crop to grayscale, then to pure black/white (pixels darker than 66 become black),
# so the digits are as clean and easy to read as possible, and save it as "code_bw.jpg"
gray = display.convert('L')
blackwhite = gray.point(lambda x: 0 if x < 66 else 255, '1')
blackwhite.save("code_bw.jpg")
# Reopen the file just saved and smooth it some more
im = Image.open('code_bw.jpg')
smooth_im = im.filter(ImageFilter.SMOOTH_MORE)
# Read the image with pytesseract: --psm 13 treats it as a single raw text line,
# --oem 3 uses the default OCR engine, and the whitelist allows only digits and parentheses
text = pytesseract.image_to_string(smooth_im, config='--psm 13 --oem 3 -c tessedit_char_whitelist=()0123456789')
# Print the OCR result
print(text)
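A possible variation on the script above, as a sketch only: it does the same crop, threshold and smoothing in memory, which avoids saving to JPEG and reopening (JPEG is lossy, so that round trip can add small artifacts around the digits), and it limits the whitelist to digits so a ")" cannot appear in the result. The function names `prepare_for_ocr` and `read_digits` are my own, and I have not tested whether this reads the display any better. As far as I know the whitelist is ignored by the LSTM engine in Tesseract 4.0 and works again from 4.1, so results may depend on the installed version.

```python
from PIL import Image, ImageFilter


def prepare_for_ocr(image, box, threshold=66):
    """Crop to the display area, binarise, and smooth without touching disk."""
    region = image.crop(box)
    gray = region.convert("L")
    # Stay in 8-bit grayscale ("L") so the smoothing filter can produce
    # intermediate gray values along the digit edges.
    black_white = gray.point(lambda value: 0 if value < threshold else 255)
    return black_white.filter(ImageFilter.SMOOTH_MORE)


def read_digits(image):
    """Run Tesseract on a prepared image, allowing digits only."""
    import pytesseract  # imported here so prepare_for_ocr works without it

    config = "--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(image, config=config).strip()
```

To try it on the photo from this post, something like `prepared = prepare_for_ocr(Image.open('IMG_0126.jpg'), (1660, 1200, 2280, 1380))` followed by `print(read_digits(prepared))` should do; the box is the same left/top/right/bottom area as in img_conv.py.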
IMG_0126: [attachment: display.PNG]
bw_display: [attachment: bw_display.PNG]