This is the original image:
This is the processed image:
I’m trying to automate a mini-game in which characters appear on the screen. I did some light research and managed to process the image into what you can see above, but the OCR doesn’t seem to work correctly: this code returns a single character, ‘Q’. Is there any way to do this?
I’m using version 5.4.0
Thanks in advance
My code:
import pytesseract
import cv2
import numpy as np
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\***\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'
image = cv2.imread('ocrtest.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
cv2.imshow('test', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
data = pytesseract.image_to_string(thresh, lang='eng', config='-c tessedit_char_whitelist=01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6 --oem 3')
print(data)
I tried all the different simple thresholding methods and Otsu’s Binarization from this doc. They resulted in poor image quality and basically didn’t work either. I settled on adaptive thresholding because it looks best in my opinion, but I’m not sure, since I don’t really know how it works.
I tried every other psm option, but they didn’t work either, and I settled on 6 because it at least gives me something. Weirdly, I thought that 11 would be best according to this description, but it returned nothing.
11 Sparse text. Find as much text as possible in no particular order.
Your preprocessing code is doing a pretty poor job of isolating the letters. They are well separated in the Red channel, so maybe more like this:
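import cv2 as cv
# Load image (OpenCV stores channels in BGR order)
im = cv.imread('letters.png')
# Operate on Red channel, which is index 2 in BGR
red = im[..., 2]
# Pixels brighter than 180 in red become black, everything else white
_, thresh = cv.threshold(red, 180, 255, cv.THRESH_BINARY_INV)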
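To make the inverse threshold above concrete, here is a minimal pure-Python sketch of what it does to a single Red value. The three pixels are made-up illustrative values, not taken from the actual screenshot, and `binary_inv` is a hypothetical helper that only mimics the per-pixel rule of `THRESH_BINARY_INV`:

```python
# Hypothetical BGR pixels (OpenCV channel order); illustrative values only,
# not measured from the real image.
pixels = {
    "white letter": (250, 250, 250),
    "blue background": (200, 90, 40),
    "dark outline": (20, 20, 20),
}

def binary_inv(value, thresh=180, maxval=255):
    """Per-pixel rule of THRESH_BINARY_INV: 0 if value > thresh, else maxval."""
    return 0 if value > thresh else maxval

# Index 2 is the Red channel in BGR order
result = {name: binary_inv(bgr[2]) for name, bgr in pixels.items()}
print(result)
```

With these assumed values, only the letter pixel has a Red value above 180, so it alone turns black while everything else turns white. That gives dark text on a light background.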
If the original image is not properly representative, but may consist of differently coloured backgrounds, you could convert to HSV mode, and look for the white, unsaturated letters (in the “Saturation” channel) instead of segmenting on the Red channel. That is more like this:
import cv2 as cv
# Load image (OpenCV stores channels in BGR order)
im = cv.imread('letters.png')
# Convert to HSV colourspace and select "Saturation" channel
hsv = cv.cvtColor(im, cv.COLOR_BGR2HSV)
s = hsv[..., 1]
# Unsaturated pixels (white/black/grey) become black, coloured pixels become white
_, thresh = cv.threshold(s, 40, 255, cv.THRESH_BINARY)
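As a rough illustration of why the Saturation channel separates white letters from coloured backgrounds, here is a small sketch using only the standard library. The sample pixels are assumptions, and `approx_saturation` is a hypothetical helper that approximates the 8-bit S value as 255 * (max - min) / max, so its rounding may differ slightly from OpenCV's:

```python
import colorsys

def approx_saturation(b, g, r):
    """Approximate 8-bit saturation of a BGR pixel: 255 * (max - min) / max."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(s * 255)

# Hypothetical BGR pixels; illustrative values only
samples = {
    "white letter": (250, 250, 250),
    "red background": (30, 30, 220),
    "green background": (40, 200, 60),
}

# Same rule as THRESH_BINARY with thresh=40: 255 if saturation > 40, else 0
mask = {
    name: 255 if approx_saturation(*bgr) > 40 else 0
    for name, bgr in samples.items()
}
print(mask)
```

Under these assumptions the white pixel has zero saturation and ends up black, while both coloured backgrounds end up white regardless of their hue. That is why this approach should cope with differently coloured backgrounds.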