How to Extract Data from an SVG-Based Map with PNG Overlays on a Web App?
I’m working on a project that requires scraping data from a web app where the relevant information is displayed on a map. Here’s the detailed scenario:
- The map data is embedded within an SVG.
- The information I need appears when certain options are toggled on the web app.
- The data is displayed as PNG overlays on the SVG map.
- The PNGs represent specific locations calculated by the web app.
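To make the structure concrete, here is a minimal sketch of what I think reading the overlays directly would look like. It assumes the PNGs are attached as `<image>` elements inside the SVG and that their `x`/`y` attributes hold the position. The sample markup, the file names, and the `extract_overlays` helper are all hypothetical and invented for illustration; I have not confirmed that the real page is structured this way.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical markup: invented for illustration, not copied from the real page.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink" width="800" height="600">
  <g id="basemap"><path d="M0 0 L800 600"/></g>
  <g id="overlays">
    <image xlink:href="marker_a.png" x="120.5" y="340" width="16" height="16"/>
    <image href="marker_b.png" x="410" y="95.25" width="16" height="16"/>
  </g>
</svg>"""


def extract_overlays(svg_text):
    """Return one dict per <image> element: its PNG reference and position."""
    root = ET.fromstring(svg_text)
    overlays = []
    for img in root.iter(f"{{{SVG_NS}}}image"):
        # SVG 1.1 uses xlink:href; SVG 2 also allows a plain href attribute.
        href = img.get(f"{{{XLINK_NS}}}href") or img.get("href")
        overlays.append({
            "href": href,
            "x": float(img.get("x", 0)),
            "y": float(img.get("y", 0)),
        })
    return overlays


if __name__ == "__main__":
    for overlay in extract_overlays(SAMPLE_SVG):
        print(overlay)
```

If the overlays really are exposed like this in the DOM, the coordinates could be read directly and OCR would not be needed, but the positions would still be in SVG units rather than map coordinates.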
My initial approach involved taking screenshots and using Pytesseract OCR to extract the information. However, this method is impractical for processing around 5 million entries due to poor OCR accuracy and irrelevant data being captured.
Here is the link to the webpage. Solve the reCAPTCHA, then click on Bureau InfoLot.
Could you suggest a more efficient way to extract this data programmatically with Python?
Here are some related images:
Sample
And the preview response:
Sample2
I am looking for an alternative approach with better accuracy.