I have a function in Python which reads Portuguese text with EasyOCR. For some reason it doesn’t always recognize the “e” between longer words, which is a common connector word (“and”) in this language.
Is there any way I could force EasyOCR to recognize the “e” as part of the phrase?
import cv2
import easyocr
import matplotlib.pyplot as plt

def getOCRResults(imgFP) -> list[tuple[tuple[int,int,int,int]|None,str|None,float|None]]:
    # Portuguese-only reader, running on CPU
    reader = easyocr.Reader(['pt'], gpu=False)
    result = reader.readtext(imgFP, width_ths=0.7, add_margin=0.2, height_ths=0.8)
    return result

def draw_bounding_boxes(image, detections, threshold=0.25):
    for bbox, text, score in detections:
        # Only detections scoring above the threshold are drawn
        if score > threshold:
            cv2.rectangle(image, tuple(map(int, bbox[0])), tuple(map(int, bbox[2])), (0, 255, 0), 1)
            cv2.putText(image, text, tuple(map(int, bbox[0])), cv2.FONT_HERSHEY_SIMPLEX, 0.35, (255, 0, 0), 1)

OCRResults = getOCRResults(image_path)
img = cv2.imread(image_path)
draw_bounding_boxes(img, OCRResults, 0.25)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGBA))
plt.show()
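Since draw_bounding_boxes only draws detections with a score above the threshold, a missing “e” could mean either that EasyOCR never detected it or that it was detected with a low score and then skipped. A minimal sketch for telling the two cases apart is below. The helper name split_by_threshold and the sample data are made up for illustration; they are not part of EasyOCR, and the real input would be OCRResults.

```python
def split_by_threshold(detections, threshold=0.25):
    """Hypothetical helper: separate the detections that would be drawn
    from the ones the score threshold would skip."""
    kept = [d for d in detections if d[2] > threshold]
    dropped = [d for d in detections if d[2] <= threshold]
    return kept, dropped

# Made-up sample in the (bbox, text, score) shape used above,
# for illustration only.
sample = [
    ([[0, 0], [40, 0], [40, 10], [0, 10]], "Compras", 0.91),
    ([[45, 0], [50, 0], [50, 10], [45, 10]], "e", 0.12),
    ([[55, 0], [95, 0], [95, 10], [55, 10]], "Vendas", 0.88),
]

kept, dropped = split_by_threshold(sample)
for bbox, text, score in dropped:
    print(f"skipped by threshold: {text!r} (score={score:.2f})")
```

If the “e” never shows up in either list for the real results, the threshold is not the cause and the detection step itself is missing it.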
Part of the image where the letter “e” is not recognized as text when it stands alone.
Another area of the same image, where the “e” was recognized as part of the text in some of the options (the 2nd and 4th). The 1st and 3rd options in the combobox show that the “e” was not recognized there.