How do I refine OCR on an image such as this one so that EVERY number is extracted?
I used EasyOCR for this.
import cv2
import matplotlib.pyplot as plt
import numpy as np
import easyocr
image = cv2.imread(image_path)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
inverted = cv2.bitwise_not(gray)
rescaled = cv2.resize(inverted, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_LINEAR)
thresh = cv2.threshold(rescaled, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 2))
morphed_image = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
eroded = cv2.erode(morphed_image, kernel, iterations=1)
final_image = cv2.dilate(eroded, kernel, iterations=1)
This is the preprocessing I did.
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(final_image, width_ths=0.1)
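For reference, `readtext` with its default `detail=1` returns a list of `(bbox, text, confidence)` tuples, where `bbox` holds the four corner points of the box. The following is only a minimal sketch of how the numeric detections could be pulled out of such a result; the helper name `numeric_detections`, the `min_conf` parameter and the sample data are hypothetical and not part of EasyOCR.

```python
def numeric_detections(result, min_conf=0.0):
    """Keep only detections whose text parses as a number.

    `result` is assumed to have the shape returned by
    easyocr.Reader.readtext with detail=1: (bbox, text, confidence).
    Returns a list of (value, (centre_x, centre_y), confidence).
    """
    kept = []
    for bbox, text, conf in result:
        try:
            value = float(text.strip())
        except ValueError:
            continue  # not a number, e.g. an axis title
        if conf >= min_conf:
            xs = [point[0] for point in bbox]
            ys = [point[1] for point in bbox]
            centre = (sum(xs) / len(xs), sum(ys) / len(ys))
            kept.append((value, centre, conf))
    return kept


# Made-up sample in the same shape as a readtext result (not real OCR output)
sample = [
    ([[10, 5], [30, 5], [30, 15], [10, 15]], "12", 0.91),
    ([[40, 5], [80, 5], [80, 15], [40, 15]], "Count", 0.88),
    ([[10, 90], [30, 90], [30, 100], [10, 100]], "0.5", 0.40),
]
print(numeric_detections(sample))
```

Listing the box centres this way makes it easier to see which numbers are missing from the result, as opposed to detected but misread.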
For some reason it doesn’t detect every number in the plot. I highlighted the bounding boxes of the detected text, and this is the output I got:
Not all numbers get detected.
There are 2 sets of numbers I specifically want:
- The numbers above each vertical line of the graph (excluding the Y axis of the graph)
- All the numbers below the X axis that represent its scale.
I have the coordinates of the X and Y axes of the graph, which I already found using a Hough transform, so I can create 2 regions of interest:
- To the right of the Y axis and above the X axis
- Below the X axis
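The two regions above can be sketched as slice bounds computed from the axis positions. This is only an illustration: the function name `roi_bounds`, its parameters (`y_axis_x` for the column of the Y axis, `x_axis_y` for the row of the X axis, and `margin` to step off the axis line itself) and the example values are assumptions, not my actual code.

```python
def roi_bounds(height, width, y_axis_x, x_axis_y, margin=0):
    """Return (top, bottom, left, right) slice bounds for both regions.

    y_axis_x: column index of the vertical (Y) axis.
    x_axis_y: row index of the horizontal (X) axis.
    Image rows grow downward, so "above the X axis" means rows < x_axis_y.
    """
    def clamp(value, low, high):
        return max(low, min(value, high))

    # Region 1: right of the Y axis and above the X axis
    roi1 = (0, clamp(x_axis_y - margin, 0, height),
            clamp(y_axis_x + margin, 0, width), width)
    # Region 2: everything below the X axis
    roi2 = (clamp(x_axis_y + margin, 0, height), height, 0, width)
    return roi1, roi2


# Hypothetical image size and axis positions
roi1, roi2 = roi_bounds(height=400, width=600, y_axis_x=50, x_axis_y=350, margin=3)
print(roi1)  # (0, 347, 53, 600)
print(roi2)  # (353, 400, 0, 600)
```

Each tuple can then be used to crop the image, for example `final_image[top:bottom, left:right]`. If the axis coordinates were measured on the original image, they would need to be multiplied by the same 1.2 factor used in the resize step before cropping the preprocessed image.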
However, not all numbers get detected. This is the output I got for region of interest 1, which should find all the numbers above the vertical lines: