I want to write a small C++ program that splits a PDF file according to its table of contents (TOC). For this I need the title of each TOC item and the page it starts (and ends) on. I heard that poppler is used for the same purpose in PDF readers (e.g. Okular), so I opted for the poppler-cpp interface. However, the documentation does not say how to retrieve the full TOC hierarchy together with page numbers. Is this possible with the poppler-cpp API, or should I use something else?
So far I have managed to list the titles of all TOC items, but as a flat listing that does not show the hierarchical structure. Here is a minimal example:
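#include <iostream>
#include <memory>
#include "poppler-document.h"
#include "poppler-toc.h"

// Recursively print the title of a TOC item and of all its descendants.
void print_toc(const poppler::toc_item *item) {
    std::cout << item->title().to_latin1() << '\n'; // print toc item title
    // TODO: print page number
    for (const poppler::toc_item *child : item->children())
        print_toc(child);
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " file.pdf\n";
        return 1;
    }
    // load_from_file returns a null pointer on failure; the caller owns the result.
    std::unique_ptr<poppler::document> doc(poppler::document::load_from_file(argv[1], "", ""));
    if (!doc) {
        std::cerr << "could not open " << argv[1] << '\n';
        return 1;
    }
    // create_toc returns a null pointer if the document has no TOC.
    std::unique_ptr<poppler::toc> toc(doc->create_toc());
    if (!toc) {
        std::cerr << "document has no TOC\n";
        return 1;
    }
    print_toc(toc->root());
}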
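
To make the goal concrete: once the start page of each TOC item is known, the end page can be derived from the start of the next item at the same or a shallower nesting level. The following is only a sketch of that step. `TocEntry` and `assign_end_pages` are hypothetical names of my own, not part of poppler-cpp, and the sketch assumes 1-based page numbers and entries listed in document order.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one flattened TOC item; not a poppler-cpp type.
struct TocEntry {
    std::string title;
    int depth;       // nesting level, 0 = top level
    int start_page;  // 1-based, assumed to be known already
    int end_page;    // filled in by assign_end_pages
};

// An entry ends on the page before the next entry at the same or a shallower
// depth begins, or on the last page of the document if there is no such entry.
// The result is clamped so that end_page is never smaller than start_page
// (two sections may start on the same page).
void assign_end_pages(std::vector<TocEntry> &entries, int page_count) {
    for (std::size_t i = 0; i < entries.size(); ++i) {
        int end = page_count;
        for (std::size_t j = i + 1; j < entries.size(); ++j) {
            if (entries[j].depth <= entries[i].depth) {
                end = std::max(entries[i].start_page, entries[j].start_page - 1);
                break;
            }
        }
        entries[i].end_page = end;
    }
}
```

With this in place, the missing piece is only a way to obtain `start_page` for each item from the TOC.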