I’m trying to implement a semantic segmentation project using the DeepLabV3 model from PyTorch’s torchvision library. The model expects input images of size 520×520.
I want to split my images into 520×520 patches, run them through the model, and then reassemble the predictions into a full image that I can compare against a pixel mask.
However, every code example I’ve found has the same problem: if the image height or width isn’t evenly divisible by 520, all the pixels at the end that don’t fit into a full 520×520 patch simply get discarded.
For example, for an image with a width of 1920 pixels I get 3 patches of width 520, and the remaining 360 pixels are discarded. The same goes for the height.
I’d prefer the last patch to overlap with the neighboring patches so that I get a prediction for every pixel of the image, roughly like the sketch below.
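To make the idea concrete, here is a rough sketch of the tiling behavior I have in mind (the `tile_starts` helper and the 1080×1920 dummy image are just for illustration, and the actual DeepLabV3 forward pass is left as a comment):

```python
import torch

def tile_starts(length, patch, stride):
    # All start offsets needed to cover `length`; the final start is shifted
    # back so the last patch ends exactly at the image border (overlapping).
    starts = list(range(0, max(length - patch, 0) + 1, stride))
    if starts[-1] + patch < length:
        starts.append(length - patch)
    return starts

patch = 520
image = torch.randn(3, 1080, 1920)                 # dummy image, C x H x W
ys = tile_starts(image.shape[1], patch, patch)     # [0, 520, 560]
xs = tile_starts(image.shape[2], patch, patch)     # [0, 520, 1040, 1400]

prediction = torch.zeros(image.shape[1], image.shape[2], dtype=torch.long)
for y in ys:
    for x in xs:
        crop = image[:, y:y + patch, x:x + patch].unsqueeze(0)
        # out = model(crop)['out']                              # DeepLabV3 forward pass
        # prediction[y:y + patch, x:x + patch] = out.argmax(1)[0]  # later patches overwrite the overlap
```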
I’ve tried the patchify library and torch.unfold, but both produced the same result and just discarded the leftover pixels.
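For example, with unfold the leftover rows and columns are silently dropped:

```python
import torch

x = torch.randn(3, 1080, 1920)
patches = x.unfold(1, 520, 520).unfold(2, 520, 520)
print(patches.shape)  # torch.Size([3, 2, 3, 520, 520]) -> 40 rows and 360 columns are lost
```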