I’ve been experimenting with the SVHN dataset. I wanted to crop out each digit before training, and I accidentally noticed that image 53.png
from the test dataset has the wrong labels (9 and 3 instead of 3 and 3).
Can anyone replicate the problem? If the labels really are wrong, what should be done about it?
I downloaded the dataset from the official website, unzipped it, and read digitStruct.mat.
My code (assuming all the data is in the data/train folder):
import os
from PIL import Image
from pymatreader import read_mat
import matplotlib.pyplot as plt
train_mat = read_mat('data/train/digitStruct.mat')
print(train_mat['digitStruct']['name'][52])
print(train_mat['digitStruct']['bbox'][52])
Returns
>>53.png
>>{'label': [9.0, 3.0], 'height': [84.0, 84.0], 'width': [59.0, 52.0], 'left': [160.0, 208.0], 'top': [34.0, 18.0]}
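For reference, here is a minimal sketch of the cropping step I had in mind, which turns one bbox entry into per-digit crop boxes. The helper name `digit_boxes` and the 400x150 image size are made up for illustration. I am also assuming that entries for single-digit images may hold plain numbers instead of lists, so the helper accepts both.

```python
def digit_boxes(bbox, image_width, image_height):
    """Return a list of (label, (left, top, right, bottom)) for one bbox entry.

    Hypothetical helper: boxes are clamped to the image so they can be used
    directly for cropping.
    """
    def as_list(value):
        # Assumption: single-digit entries may be scalars rather than lists.
        try:
            return list(value)
        except TypeError:
            return [value]

    keys = ('label', 'left', 'top', 'width', 'height')
    columns = [as_list(bbox[key]) for key in keys]

    boxes = []
    for label, left, top, width, height in zip(*columns):
        x0 = max(0, int(left))
        y0 = max(0, int(top))
        x1 = min(image_width, int(left + width))
        y1 = min(image_height, int(top + height))
        boxes.append((int(label), (x0, y0, x1, y1)))
    return boxes


# The entry printed above for 53.png; the image size here is a placeholder.
bbox_53 = {'label': [9.0, 3.0], 'height': [84.0, 84.0], 'width': [59.0, 52.0],
           'left': [160.0, 208.0], 'top': [34.0, 18.0]}
print(digit_boxes(bbox_53, image_width=400, image_height=150))
# [(9, (160, 34, 219, 118)), (3, (208, 18, 260, 102))]
```

Each box can then be passed to `Image.crop` from PIL (with the real image size from `image.size`) to view every digit next to its stored label.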
I also displayed a few other images in case there are similar anomalies, but I haven’t noticed any.
import cv2
from matplotlib import pyplot as plt

names = train_mat['digitStruct']['name']
bboxes = train_mat['digitStruct']['bbox']

fig = plt.figure(figsize=(10, 7))
# Show four sample images in a 2x2 grid, each titled with its file name and labels
for position, index in enumerate([0, 27, 52, 90], start=1):
    # OpenCV loads images in BGR format, so convert to RGB for Matplotlib
    image = cv2.imread('data/train/' + names[index])
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    plt.subplot(2, 2, position)  # 2 rows, 2 columns
    plt.imshow(image)
    plt.axis('off')  # Hide the axis labels
    plt.title(names[index] + '\n' + str(bboxes[index]['label']))
plt.show()
Output: SVHN comparison