I have 20,000+ images of art pieces (paintings, sculptures, jars, etc.) stored in a database. The physical pieces are distributed across multiple warehouses. Ideally, each piece SHOULD carry a sticker (with its ID, a QR code, etc.), but the stickers are made of paper, so they can be damaged, poorly printed, unreadable, completely missing, or even attached to the wrong piece. My goal is to build a model that receives an input image (sent by someone from any warehouse), identifies the exact piece of art in the database, and returns its ID, details, etc.
In my case, the sample is static and fixed (there will be no “new” art pieces unless the client purchases more), so the model will never “see” new classes. This makes me think that overfitting might actually be desirable here, which translates to heavy data augmentation and a high number of epochs.
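For context, the kind of augmentation I have in mind looks roughly like this (a minimal sketch using keras3 preprocessing layers; the factors are illustrative placeholders, and random saturation would be done separately, e.g. via `tf$image$random_saturation` from the tensorflow package):

```r
library(keras3)

# Sketch of an augmentation pipeline; factors are illustrative placeholders
augmenter <- keras_model_sequential() |>
  layer_random_flip(mode = "horizontal_and_vertical") |>
  layer_random_rotation(factor = 0.1) |>   # up to +/- 10% of a full turn
  layer_random_brightness(factor = 0.2)

# Applied with training = TRUE so the random ops are active:
# augmented_batch <- augmenter(image_batch, training = TRUE)
```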
Notice that there is ONLY ONE image available per class (art piece). That is the situation, and it cannot change.
The selected programming language is R, mainly the tensorflow and keras3 libraries.
That being said, it’s hard for me to find solutions, since most documentation relies on either the same cats vs. dogs or MNIST datasets. My questions are:
- Is a Siamese network the right algorithm for this purpose?
- What approaches can I take in order to improve accuracy?
Just for testing purposes, I took a sample of 10 pieces and generated 9 additional images from each via data augmentation (rotation, vertical/horizontal flipping, random saturation factors, random brightness factors, etc.). I then created 5 positive and 5 negative pairs per class. Finally, I trained a Siamese network, but the accuracy seems to be stuck at 49%.
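For reference, the pair model is essentially the following (a trimmed-down sketch, assuming 128x128 RGB inputs and an absolute-difference head; layer names follow the keras3 functional API, and the architecture/sizes are placeholders, not my exact setup):

```r
library(keras3)

# Shared encoder applied to both images of a pair (weights are shared)
encoder <- keras_model_sequential(input_shape = c(128, 128, 3)) |>
  layer_conv_2d(32, 3, activation = "relu") |>
  layer_max_pooling_2d() |>
  layer_conv_2d(64, 3, activation = "relu") |>
  layer_global_average_pooling_2d() |>
  layer_dense(128)

input_a <- layer_input(shape = c(128, 128, 3))
input_b <- layer_input(shape = c(128, 128, 3))

# Element-wise |embedding difference|, then a same/different sigmoid head
diff <- layer_subtract(list(encoder(input_a), encoder(input_b))) |>
  layer_lambda(\(x) op_abs(x))
output <- diff |> layer_dense(1, activation = "sigmoid")

model <- keras_model(list(input_a, input_b), output)
model |> compile(optimizer = "adam",
                 loss = "binary_crossentropy",
                 metrics = "accuracy")
```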