I have been trying to extract data from an unstructured form that looks like a “CIOMS” form.
I am using PaddleOCR. I have been able to extract the text, but I am not able to form meaningful clusters from it.
I have also tried PyMuPDF to extract the text and group it, but the text fragments were too scattered to be grouped.
What are possible approaches to extracting meaningful clusters from a form after reading it with PaddleOCR?
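To make "meaningful clusters" concrete, here is a minimal sketch of one possible approach: proximity-based grouping of OCR bounding boxes. It assumes the OCR results have already been converted to axis-aligned `(x0, y0, x1, y1, text)` tuples in pixel coordinates. The function names and the gap thresholds are hypothetical and would need tuning for the actual form; this is not a PaddleOCR API.

```python
from typing import Dict, List, Tuple

# Assumed input shape: (x0, y0, x1, y1, text), with y growing downwards.
Box = Tuple[float, float, float, float, str]


def axis_gap(a0: float, a1: float, b0: float, b1: float) -> float:
    """Distance between two 1-D intervals; 0 if they overlap."""
    return max(0.0, max(a0, b0) - min(a1, b1))


def cluster_boxes(
    boxes: List[Box],
    max_x_gap: float = 20.0,  # hypothetical threshold, tune per form
    max_y_gap: float = 10.0,  # hypothetical threshold, tune per form
) -> List[List[str]]:
    """Group boxes that lie close together, then return the text of each
    group in top-to-bottom, left-to-right order."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes whose gaps on both axes are small enough.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            a, b = boxes[i], boxes[j]
            close_x = axis_gap(a[0], a[2], b[0], b[2]) <= max_x_gap
            close_y = axis_gap(a[1], a[3], b[1], b[3]) <= max_y_gap
            if close_x and close_y:
                parent[find(i)] = find(j)

    # Collect the members of each connected group.
    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # Order boxes inside each group, then order the groups themselves.
    clusters = [sorted(g, key=lambda b: (b[1], b[0])) for g in groups.values()]
    clusters.sort(key=lambda c: (min(b[1] for b in c), min(b[0] for b in c)))
    return [[b[4] for b in c] for c in clusters]
```

With thresholds like these, a field label and the value printed directly below it end up in the same cluster, while fields in a different column stay separate. For forms with ruled cells, detecting the cell borders first and assigning each box to a cell may work better than pure distance thresholds.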