I’m trying to build a Python script that can detect text that flashes on the screen for a very short period (around 0.2 seconds). I’m using mss for screen capturing and pytesseract for OCR. Below is the code I’m working with:
```python
import pytesseract
import numpy as np
from mss import mss
import time
import threading

# Point pytesseract at the tesseract binary
# (note the attribute is pytesseract.pytesseract.tesseract_cmd)
pytesseract.pytesseract.tesseract_cmd = "/opt/homebrew/bin/tesseract"

# Shared state between threads
last_detected_text = ""
detected_text_lock = threading.Lock()

# Capture the screen in a loop and run OCR on each frame
def capture_and_process_text():
    global last_detected_text
    # mss instances are not thread-safe, so create one inside the thread that uses it
    sct = mss()
    monitor = sct.monitors[1]  # full screen (adjust the index for multiple monitors)
    while True:
        start_time = time.time()
        # Capture the screen as a BGRA numpy array
        screenshot = np.array(sct.grab(monitor))
        # Skip grayscale conversion and thresholding for speed
        text = pytesseract.image_to_string(screenshot, lang="eng").strip()
        # Collapse whitespace to reduce noise
        normalized_text = " ".join(text.split())
        # Only report new, non-empty text
        with detected_text_lock:
            if normalized_text and normalized_text != last_detected_text:
                print(f"Detected Text: {normalized_text}")
                last_detected_text = normalized_text
        end_time = time.time()
        print(f"Frame processed in {end_time - start_time:.5f} seconds")

print("Starting full-screen text capture...")
# daemon=True lets the program exit when the main thread is interrupted
capture_thread = threading.Thread(target=capture_and_process_text, daemon=True)
capture_thread.start()

try:
    while True:
        time.sleep(1)  # Keep the main thread alive
except KeyboardInterrupt:
    print("Text capture stopped.")
```
This works reasonably well for capturing and processing text on the screen, but I’m facing a few challenges:
- Speed: sometimes the script isn't fast enough to catch very brief flashes; I'm aiming to catch text that stays on screen for only about 0.2 seconds.
- OCR processing overhead: pytesseract is slow on full-screen images, and I'm wondering if there's a way to speed it up.
- Capturing all text: since I don't know where on the screen the text will appear, I have to capture the entire screen, which adds overhead.
I’m looking for advice on how to optimize this code to make it faster and more reliable. Specifically:
1. Is there a way to make screen capture faster while still processing the entire screen?
2. Are there faster alternatives to pytesseract that work well for real-time OCR?
3. Any general tips for optimizing the capture-then-OCR workflow to handle such short flashes?
I’d appreciate any guidance or suggestions on how to approach this problem. Thanks in advance!
Can you identify where or when the flashes happen (even if not with 100% accuracy) without doing full OCR on the whole screen?
Like, how about this:
Every detection interval (200ms or so), cheaply compare the current screen with the previous one to see if text might have appeared. False-positive results are okay but false-negative results are not. This must always reliably complete within your 200ms detection interval.
How you implement this depends on what the text looks like and what else is going on on the screen. For example, if the screen is otherwise static (no animations), you could just look to see if any pixels have changed. Or if you know the text or background will have specific colors, you could look for those.
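For the mostly-static-screen case, here's a minimal sketch of what that cheap detector could look like, using mss as in your code. The sampling stride, the per-channel tolerance of 20, the changed-pixel `threshold`, and the `on_change` callback are all placeholder choices you'd need to tune for your screen:

```python
import time
import numpy as np
from mss import mss

def run_detector(on_change, interval=0.2, threshold=5000):
    """Every `interval` seconds, cheaply diff the screen against the
    previous frame and call on_change(frame) when enough pixels changed."""
    sct = mss()
    monitor = sct.monitors[1]
    prev = None
    while True:
        start = time.monotonic()
        frame = np.array(sct.grab(monitor))  # BGRA numpy array
        # Sample every 4th pixel so the comparison stays well under 200 ms
        small = frame[::4, ::4, :3].astype(np.int16)
        if prev is not None:
            # A pixel counts as "changed" if any channel moved by more than 20 levels
            changed = np.count_nonzero(np.abs(small - prev).max(axis=2) > 20)
            if changed > threshold:
                on_change(frame)  # false positives are fine; OCR sorts them out
        prev = small
        # Sleep out the remainder of the detection interval
        time.sleep(max(0.0, interval - (time.monotonic() - start)))
```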
When you think text might have appeared, copy that area of the screen into a queue. A separate thread will pick it up from there and do the expensive OCR part.
The OCR thread is allowed to take longer than your 200ms detection interval, as long as, on average, it’s able to process images from the queue faster than the quick-detection thread is inserting them.
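A sketch of that hand-off, using a standard `queue.Queue` and assuming the hypothetical `run_detector` from the sketch above as the producer (for simplicity this enqueues the whole frame rather than just the changed area):

```python
import queue
import threading
import numpy as np
import pytesseract

ocr_queue = queue.Queue()

def ocr_worker():
    """Consume frames and run the expensive OCR step off the hot path."""
    last_text = ""
    while True:
        frame = ocr_queue.get()  # blocks until the detector enqueues a frame
        # Convert BGRA -> RGB; pytesseract accepts numpy arrays directly
        rgb = np.ascontiguousarray(frame[:, :, :3][:, :, ::-1])
        text = " ".join(pytesseract.image_to_string(rgb, lang="eng").split())
        if text and text != last_text:
            print(f"Detected Text: {text}")
            last_text = text
        ocr_queue.task_done()

threading.Thread(target=ocr_worker, daemon=True).start()

# The detection loop only enqueues; it never waits on OCR
run_detector(on_change=ocr_queue.put)
```

If the queue ever grows without bound, that's your signal that OCR is falling behind the detector and you need to shrink the captured region or throttle enqueues.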