I am using Tesseract to detect the orientation of scanned images, which mostly contain text.
It works in general but sometimes fails, even in simple cases (e.g. when the text is clear).
When it fails, I found that most of the time the ‘script’ is wrongly detected (e.g. as ‘Cyrillic’ or ‘Arabic’). Here is one such result:
'page_num': 0,
'orientation': 0,
'rotate': 0,
'orientation_conf': 0.03,
'script': 'Cyrillic',
'script_conf': 1.48
I know in advance that the text is in ‘Latin’ script, in ‘French’ or ‘English’.
Is there a way to specify that to Tesseract? I found that it is possible to specify it when converting an image to text, but not when detecting orientation.
Here is the code I use:
import cv2
import pytesseract

img = cv2.imread(filename)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]  # binarize with Otsu's threshold (high contrast)
rgb = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)  # img is single-channel here, so GRAY2RGB rather than BGR2RGB
results = pytesseract.image_to_osd(rgb, output_type=pytesseract.Output.DICT)
print(results["rotate"])
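
Until there is a way to constrain the script, the bad detections could at least be flagged rather than trusted. The sketch below only inspects the dictionary returned by `image_to_osd`. The helper name `trusted_rotation` and the threshold of `1.0` are assumptions made for illustration, not part of pytesseract or Tesseract, and the threshold would need tuning on real scans.

```python
from typing import Optional

# Hypothetical settings, chosen for illustration only.
EXPECTED_SCRIPT = "Latin"
MIN_ORIENTATION_CONF = 1.0


def trusted_rotation(osd: dict,
                     expected_script: str = EXPECTED_SCRIPT,
                     min_conf: float = MIN_ORIENTATION_CONF) -> Optional[int]:
    """Return osd['rotate'], or None when the OSD result looks unreliable."""
    # A script other than the one known in advance suggests a failed detection.
    if osd.get("script") != expected_script:
        return None
    # A very low orientation confidence is also treated as a failure.
    if osd.get("orientation_conf", 0.0) < min_conf:
        return None
    return osd["rotate"]


# The failing result shown earlier in the question.
failed = {
    "page_num": 0,
    "orientation": 0,
    "rotate": 0,
    "orientation_conf": 0.03,
    "script": "Cyrillic",
    "script_conf": 1.48,
}
print(trusted_rotation(failed))  # None: the script is not 'Latin'
```

With the code above, this would be called as `trusted_rotation(results)`. A `None` return marks an image to retry with different preprocessing or to check by hand.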