I’m fine-tuning LayoutLMv3 on the CORD-v2 dataset and I’m stuck on the data preprocessing, specifically on how to correctly extract the total amount (TTC, i.e. the tax-inclusive total) from the receipt images. The examples I’ve found online seem to use the older CORD dataset, which has a different format. The new CORD-v2 dataset only includes images and ground truth labels.
Has anyone worked with this dataset, or does anyone have an idea of how to approach this? Any help would be greatly appreciated!
I’ve tried examples from YouTube and Hugging Face but haven’t had any success.
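For context, here is a minimal sketch of the parsing step as I currently understand it, not a verified solution. It assumes each CORD-v2 record has a `ground_truth` JSON string containing `gt_parse.total.total_price` plus a `valid_line` list whose entries carry a `category` and `words` with `quad` corner coordinates; those field names are my reading of the dataset and should be checked against a real record. The helper `extract_total` and the sample record are made up for illustration. Boxes are scaled to the 0-1000 range that LayoutLMv3 expects.

```python
import json


def extract_total(ground_truth, width, height):
    """Return the total string and the 0-1000 normalized boxes of its words.

    Assumption: `ground_truth` is a JSON string laid out as described above.
    `width` and `height` are the pixel size of the receipt image.
    """
    gt = json.loads(ground_truth)

    # Assumed location of the total amount in the parsed labels.
    total = gt.get("gt_parse", {}).get("total", {})
    total_text = total.get("total_price") if isinstance(total, dict) else None

    words = []
    for line in gt.get("valid_line", []):
        # Assumed category name for the total amount line.
        if line.get("category") != "total.total_price":
            continue
        for word in line.get("words", []):
            quad = word["quad"]
            xs = [quad["x1"], quad["x2"], quad["x3"], quad["x4"]]
            ys = [quad["y1"], quad["y2"], quad["y3"], quad["y4"]]
            # Turn the four-corner quad into an axis-aligned box in 0-1000.
            box = [
                int(1000 * min(xs) / width),
                int(1000 * min(ys) / height),
                int(1000 * max(xs) / width),
                int(1000 * max(ys) / height),
            ]
            words.append((word["text"], box))

    return {"total": total_text, "words": words}


# Hypothetical record, hand-written to match the assumed layout.
sample = json.dumps({
    "gt_parse": {"total": {"total_price": "45,500"}},
    "valid_line": [
        {
            "category": "menu.nm",
            "words": [{
                "text": "Coffee",
                "quad": {"x1": 10, "y1": 10, "x2": 90, "y2": 10,
                         "x3": 90, "y3": 40, "x4": 10, "y4": 40},
            }],
        },
        {
            "category": "total.total_price",
            "words": [{
                "text": "45,500",
                "quad": {"x1": 100, "y1": 200, "x2": 300, "y2": 200,
                         "x3": 300, "y3": 260, "x4": 100, "y4": 260},
            }],
        },
    ],
})

result = extract_total(sample, width=1000, height=2000)
```

If that layout is right, I would expect to call this once per record with the image size taken from the image itself, then feed the words and boxes to the LayoutLMv3 processor with its built-in OCR turned off. What I can’t confirm is whether the real records actually look like this.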
Thanks in advance!