I’m trying to use pytesseract or tesserocr to extract numbers from images. I have gone through a few answers on this topic and applied their suggestions, but none of them solved my problem.
For example, I have images like these:
I want to extract the numbers contained within the small backgrounds (or tiles), but it is not working as expected. I’ve converted the picture to grayscale and tried various PSM values such as 6, 7, and 8, and it failed each time.
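The grayscale and PSM attempts described above can be sketched roughly as follows. This is a minimal sketch rather than the exact code used: the helper names `psm_config` and `ocr_with_psm_values` and the image path argument are hypothetical, and it assumes `opencv-python`, `pytesseract`, and the Tesseract binary are installed.

```python
def psm_config(psm):
    """Build the Tesseract option string for one page segmentation mode."""
    return f"--psm {psm}"


def ocr_with_psm_values(image_path, psm_values=(6, 7, 8)):
    """Convert the image to grayscale and run OCR once per PSM value.

    Returns a dict mapping each PSM value to the text Tesseract produced.
    """
    # Imported here so the helper above is usable without the OCR stack.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:
        # cv2.imread returns None instead of raising on a bad path
        raise FileNotFoundError(image_path)

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    results = {}
    for psm in psm_values:
        text = pytesseract.image_to_string(gray, config=psm_config(psm))
        results[psm] = text.strip()
    return results
```

Calling `ocr_with_psm_values` with the path to one of the tile images makes it easy to compare what each PSM value returns side by side.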
I also tried drawing boxes around the numbers using pytesseract.image_to_boxes and pytesseract.image_to_data, but neither yielded the proper result. I have also tried cv2.adaptiveThreshold(img, 252, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 31, 7), and that didn’t work either.
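For reference, the thresholding and box-detection attempt can be sketched like this. It is a hedged illustration, not the exact code used: the helper names `numeric_boxes` and `threshold_and_locate`, the `min_conf` filter, and the default PSM of 6 are assumptions added for the example. The threshold parameters are the ones quoted above.

```python
def numeric_boxes(data, min_conf=0.0):
    """Keep only the purely numeric entries from an image_to_data dict.

    `data` is expected to have the parallel lists "text", "conf", "left",
    "top", "width", and "height". Returns (text, left, top, width, height)
    tuples.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        # conf may arrive as a string or a number, so normalise it to float
        if text.isdigit() and float(data["conf"][i]) >= min_conf:
            boxes.append((text, data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i]))
    return boxes


def threshold_and_locate(image_path, psm=6):
    """Apply the adaptive threshold from the question, then locate digits."""
    # Imported here so numeric_boxes can be used without the OCR stack.
    import cv2
    import pytesseract

    # adaptiveThreshold needs an 8-bit single-channel image
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)

    thresh = cv2.adaptiveThreshold(
        gray, 252, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 31, 7
    )
    data = pytesseract.image_to_data(
        thresh, config=f"--psm {psm}", output_type=pytesseract.Output.DICT
    )
    return numeric_boxes(data)
```

Filtering on `text.isdigit()` shows quickly whether Tesseract detects any of the tile numbers at all, or only surrounding noise.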
In my project, not all numbers will be in this format, but right now it fails to read even these, so any help would be much appreciated.