Image link: (https://i.sstatic.net/bZPAWbSU.png)
The table in the image has several subheaders, each spanning a different number of columns.
I cannot extract the table properly because I do not know the horizontal span of each header. This affects how I read and extract information from different tables.
Note that I only have the PDF; I do not have the HTML source for it, nor the colspan values.
I tried everything I could get out of Camelot and PyMuPDF, but I was never able to get the correct span of a given header.
Even table-transformers seems to be unable to solve the issue.
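To make the goal concrete, this is roughly the result I am after. It is only a minimal sketch under assumptions: `header_colspan`, `column_edges`, and `min_overlap` are hypothetical names, the coordinates are made up, and it presumes that the x-extent of the header cell and the x-boundaries of the leaf columns are already known, which is the part I cannot get reliably from the PDF.

```python
from typing import List, Tuple


def header_colspan(header_x: Tuple[float, float],
                   column_edges: List[float],
                   min_overlap: float = 0.5) -> Tuple[int, int]:
    """Return (first_column_index, colspan) for one header cell.

    header_x     -- (x0, x1) of the header cell's bounding box (assumed known)
    column_edges -- sorted x positions of the leaf-column boundaries
    min_overlap  -- fraction of a column's width the header must cover
                    for that column to count as spanned
    """
    x0, x1 = header_x
    covered = []
    for i in range(len(column_edges) - 1):
        left, right = column_edges[i], column_edges[i + 1]
        # Width of the horizontal intersection; negative if disjoint.
        overlap = min(x1, right) - max(x0, left)
        if overlap >= min_overlap * (right - left):
            covered.append(i)
    if not covered:
        return (-1, 0)  # header does not sit above any column
    return (covered[0], covered[-1] - covered[0] + 1)


# Made-up coordinates: four leaf columns between x=0 and x=400.
edges = [0.0, 100.0, 200.0, 300.0, 400.0]
print(header_colspan((102.0, 298.0), edges))  # header above columns 1 and 2
```

If only the bounding box of the header text is available, rather than the cell itself, centered text would be narrower than the cell and this overlap test would under-count the span.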
Does anyone have any experience or a recommendation for me to try, whether it is a library, an approach, or even a model?