I’m working on a project where I need to extract individual images from scanned album pages with very high precision. Each scan is 5792×5792 pixels.
Here’s my current workflow:
1. Use YOLOv9 object detection (trained at image size 640) to detect each image on the page and crop it out.
2. Use YOLOv8 segmentation (trained at image size 640) on each crop to get a mask of the image and extract it.
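To make the cropping step concrete, here is a minimal sketch of how a box predicted on the 640×640 detector input can be mapped back to the 5792×5792 scan, so that the crop is taken from the full-resolution pixels. It assumes the square scan is resized directly to 640×640 (one uniform scale factor, no letterbox padding). `box_to_full_res` and its `margin` parameter are hypothetical names used only for illustration; they are not part of any YOLO API.

```python
import math


def box_to_full_res(box, model_size=640, scan_size=5792, margin=0):
    """Map an (x1, y1, x2, y2) box from detector-input pixels to scan pixels.

    Assumption: the scan is square and was resized straight to
    model_size x model_size, so a single scale factor applies.
    """
    scale = scan_size / model_size  # 5792 / 640 = 9.05 for these defaults
    x1, y1, x2, y2 = box

    # Round outward (floor the top-left, ceil the bottom-right) so the crop
    # does not cut into the detected image, then add an optional safety
    # margin and clamp to the scan bounds.
    left = max(0, math.floor(x1 * scale) - margin)
    top = max(0, math.floor(y1 * scale) - margin)
    right = min(scan_size, math.ceil(x2 * scale) + margin)
    bottom = min(scan_size, math.ceil(y2 * scale) + margin)
    return left, top, right, bottom


if __name__ == "__main__":
    # A 64-pixel-wide box at detector scale covers about 580 scan pixels.
    print(box_to_full_res((64, 64, 128, 128)))
```

With the defaults, each detector pixel corresponds to about 9 scan pixels per side, so a one-pixel error in a predicted box shifts the crop edge by roughly 9 pixels at full resolution. A small `margin` is one way to keep the whole image inside the crop before segmentation.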
While the results are decent, I am aiming for highly accurate masks and am facing some issues:
- Some parts of the images are missing from the mask.
- Parts of the background are included in the mask.
- The mask edges are not smooth and appear choppy.
I have also tried Mask R-CNN, but the results were slightly worse than with YOLOv8 and YOLOv9 segmentation. I also attempted mask refinement techniques, but they ended up degrading the mask quality.
I’m looking for suggestions on any preprocessing, post-processing, or potentially a different approach that could help me achieve my goal with very high precision.
Any advice or recommendations would be greatly appreciated!
Thank you in advance!