Tags: javascript, node.js, json, express, pdf

Convert PDF data, including tables, into JSON in Node.js

I'm trying to convert PDF data, including tables, into structured JSON using Node.js. I've experimented with libraries like pdf-parser, pdf-reader, and pdf2json, but the results haven’t been ideal: the table extraction is inaccurate, the JSON lacks a clear object structure, and I’m having trouble identifying column names and handling empty cells.
Ideally, I’d like to obtain JSON that represents the table structure with objects for each cell, row, and column. This would also allow me to identify column names and deal with empty cells gracefully.
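
To make the target structure concrete, here is a minimal sketch of the kind of output described above. It assumes the extraction step has already been normalized into a flat list of positioned text fragments of the form `{ x, y, text }`; that input shape, the function name `itemsToTable`, and the `tolerance` parameter are all hypothetical and not taken from any of the libraries mentioned. It also assumes the first row of the table is the header row. Rows are formed by grouping fragments with similar `y` values, columns are taken from the header's `x` positions, and any cell with no matching fragment is left as `null`.

```javascript
// Hypothetical input: positioned text fragments, roughly one per table cell.
// Real extractors use different field names; map them to { x, y, text } first.
function itemsToTable(items, tolerance = 0.5) {
  const byX = (a, b) => a.x - b.x;

  // 1. Group fragments into rows: fragments whose y values are within
  //    `tolerance` of the row's first fragment belong to the same row.
  const sorted = [...items].sort((a, b) => a.y - b.y || a.x - b.x);
  const rows = [];
  for (const item of sorted) {
    const last = rows[rows.length - 1];
    if (last && Math.abs(last.y - item.y) <= tolerance) {
      last.cells.push(item);
    } else {
      rows.push({ y: item.y, cells: [item] });
    }
  }
  if (rows.length === 0) return { columns: [], rows: [] };

  // 2. Assumption: the first row is the header. Its x positions define
  //    the columns and its text gives the column names.
  const columns = rows[0].cells
    .sort(byX)
    .map((cell) => ({ name: cell.text.trim(), x: cell.x }));

  // 3. Build one object per data row. Every column starts as null, so
  //    empty cells stay explicit instead of shifting later values left.
  const dataRows = rows.slice(1).map((row) => {
    const record = {};
    for (const col of columns) record[col.name] = null;

    for (const cell of row.cells.sort(byX)) {
      // Assign the fragment to the column whose header x is closest.
      let nearest = columns[0];
      for (const col of columns) {
        if (Math.abs(col.x - cell.x) < Math.abs(nearest.x - cell.x)) {
          nearest = col;
        }
      }
      const text = cell.text.trim();
      const previous = record[nearest.name];
      // Several fragments in one cell are joined with a space.
      record[nearest.name] = previous === null ? text : `${previous} ${text}`;
    }
    return record;
  });

  return { columns: columns.map((col) => col.name), rows: dataRows };
}

// Usage with made-up sample data: the "Pear" row has no Qty fragment.
const sample = [
  { x: 1, y: 1, text: 'Name' },
  { x: 5, y: 1, text: 'Qty' },
  { x: 9, y: 1, text: 'Price' },
  { x: 1, y: 2, text: 'Apple' },
  { x: 5, y: 2, text: '3' },
  { x: 9, y: 2, text: '1.20' },
  { x: 1.1, y: 3, text: 'Pear' },
  { x: 9.05, y: 3, text: '0.80' },
];
console.log(JSON.stringify(itemsToTable(sample), null, 2));
```

Matching each fragment to the nearest header position is only a heuristic. It would likely need adjusting for right-aligned numbers, merged cells, multi-line headers, or tables that continue across pages.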