We’re using Google OCR to read PDFs or images of Loan Estimates.
We’re defining multiple fields such as loanTerm and loanPurpose,
but we’re also labeling multiple checkboxes that can appear on the same page,
like loanTypeConventional, loanTypeUSDA, etc.,
or rateLockNo and rateLockYes.
The problem is that, using pre-trained foundation models,
the AI is not detecting fields correctly.
For example, the AI detects loanTypeUSDA even when it is not present.
I already have multiple PDFs in the dataset,
with at least 10 documents for each label I defined
and 41 labels defined in total,
but the OCR still fails on simple things such as checkboxes.
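To make the checkbox problem concrete, here is a minimal sketch of how mutually exclusive checkboxes could be reconciled after extraction. It assumes each detected entity can be reduced to a (label, confidence) pair. The grouping of labels, the 0.5 threshold, and the sample scores are illustrative assumptions, not output from the processor.

```python
# Hypothetical grouping of mutually exclusive checkbox labels
# (label names are from this post; the grouping itself is an assumption).
CHECKBOX_GROUPS = {
    "loanType": ["loanTypeConventional", "loanTypeUSDA"],
    "rateLock": ["rateLockNo", "rateLockYes"],
}


def resolve_checkboxes(entities, min_confidence=0.5):
    """Keep at most one label per group: the highest-confidence
    detection that is at or above min_confidence."""
    label_to_group = {
        label: group
        for group, labels in CHECKBOX_GROUPS.items()
        for label in labels
    }
    best = {}
    for label, confidence in entities:
        group = label_to_group.get(label)
        # Ignore non-checkbox fields and low-confidence detections.
        if group is None or confidence < min_confidence:
            continue
        if group not in best or confidence > best[group][1]:
            best[group] = (label, confidence)
    return {group: label for group, (label, _) in best.items()}


# Made-up scores, for illustration only.
sample = [
    ("loanTypeConventional", 0.91),
    ("loanTypeUSDA", 0.62),
    ("rateLockNo", 0.40),
    ("rateLockYes", 0.88),
    ("loanTerm", 0.99),
]
print(resolve_checkboxes(sample))
```

This only papers over the false positives after the fact; it does not fix the detection itself.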
What am I doing wrong?
Previously we were using Eve ai (formerly known as Butler ai), and it worked much better even with fewer examples (just 40-50), but Google OCR is painful and hard to set up.
Any recommendations? Has anyone gotten Loan Estimates processed with this OCR?