I am trying to extract the text from an image using OCR. The challenge I am facing is how to map each key to its value. For example, the key "Last Name" should have the value "XYZ". After getting the key-value pairs, I need to draw a bounding box around the text of each identified pair.
One way to do this is the following.
You can get the position of the bounding boxes using pytesseract and the method pytesseract.image_to_data(image, config=config, output_type=Output.DATAFRAME), where Output.DATAFRAME is the string "data.frame" and requires pandas to be installed.
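This returns a pandas DataFrame with one row per detected element, including the position (left, top, width, height), the confidence (conf) and the text of each word. You can then group the words by position, for example the words on the same line, to associate each key with its value.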
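As a rough sketch of that grouping step, assuming each key and its value sit on the same text line and are separated by a colon (e.g. "Last Name: XYZ"), you could group the word rows by the block_num, par_num and line_num columns that image_to_data reports and split each line at the first colon. The helper name pair_keys_and_values and the colon rule are assumptions for illustration; a real form may need a different rule, such as matching against a list of known keys. The rows are plain dicts, such as those produced by df.to_dict('records').

```python
from collections import defaultdict


def pair_keys_and_values(rows):
    """Group OCR word rows into lines and split each line at the first colon.

    `rows` is a list of dicts holding the image_to_data columns, for example
    df.to_dict('records'). Returns a list of (key, value) string pairs.
    """
    lines = defaultdict(list)
    for row in rows:
        raw = row["text"]
        # Skip missing text (None or NaN) and non-word rows (conf == -1)
        if raw is None or raw != raw or float(row["conf"]) < 0:
            continue
        text = str(raw).strip()
        if not text:
            continue
        line_id = (row["block_num"], row["par_num"], row["line_num"])
        lines[line_id].append((row["left"], text))

    pairs = []
    for line_id in sorted(lines):
        # Order the words left to right and rebuild the line of text
        words = [text for _, text in sorted(lines[line_id])]
        line = " ".join(words)
        # Assumption: the key ends at the first colon on the line
        if ":" in line:
            name, value = line.split(":", 1)
            pairs.append((name.strip(), value.strip()))
    return pairs
```

Lines without a colon are ignored here, so headings and free text do not produce pairs.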
To draw a box around every detected word, you can then do something like this:
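import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread('image.jpg')
# Output.DATAFRAME returns a pandas DataFrame (pandas must be installed)
df = pytesseract.image_to_data(img, output_type=Output.DATAFRAME)
# Keep only word rows: page/block/line rows have conf == -1 and no text
df = df[(df['conf'] != -1) & df['text'].notna()]
for row in df.itertuples():
    # Cast to plain int, since OpenCV may reject NumPy integer types
    x, y, w, h = int(row.left), int(row.top), int(row.width), int(row.height)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
# Show the annotated image once all boxes are drawn
cv2.imshow('img', img)
cv2.waitKey(0)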
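To draw a single box around a key and its value rather than one box per word, one option is to take the union of the word boxes that belong to the pair. A minimal sketch, where union_box is a hypothetical helper and each box is an (x, y, w, h) tuple built from the left, top, width and height columns:

```python
def union_box(boxes):
    """Return the smallest (x, y, w, h) rectangle covering all boxes in the list."""
    if not boxes:
        raise ValueError("need at least one box")
    x1 = min(x for x, y, w, h in boxes)
    y1 = min(y for x, y, w, h in boxes)
    x2 = max(x + w for x, y, w, h in boxes)
    y2 = max(y + h for x, y, w, h in boxes)
    # Plain ints so the result can be passed straight to OpenCV
    return int(x1), int(y1), int(x2 - x1), int(y2 - y1)


# Word boxes for "Last", "Name:" and "XYZ" (made-up coordinates)
x, y, w, h = union_box([(10, 20, 40, 15), (60, 18, 55, 17), (130, 21, 35, 14)])
# With OpenCV: cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
```

This works best when the key and value are on the same line; for values that wrap over several lines, the union box will also cover anything lying between them.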