I’m at the end of my rope here. I’m trying to train a custom object detection model, specifically for face detection, using the WIDER FACE dataset.
Initially I tried the TensorFlow Object Detection API, but I found too much conflicting/out-of-date info, so I switched to the Keras 3 object detection workflow. I like the way it’s going, but loading the data into a compatible dataset is driving me crazy. I’ve got code that generates a dataset from the downloaded WIDER FACE files. The gist is that, for any given image, I can find the X, Y, width, and height of each face, as well as the total number of faces. I can load each image into a tensor, and I can express the bounding boxes and “classes” (in quotes only because there’s just one class).
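For concreteness, the dict my generator returns looks roughly like this (the values below are made up; 'images' holds decoded image arrays, and each box is [x, y, width, height] in pixels, since that’s what the WIDER FACE annotations give):

data_dict = {
    'images': [img_0, img_1, ...],  # one HxWx3 uint8 array per image
    'bounding_boxes': {
        'boxes': [
            [[449.0, 330.0, 122.0, 149.0]],                        # image 0: one face
            [[78.0, 221.0, 7.0, 8.0], [341.0, 343.0, 9.0, 14.0]],  # image 1: two faces
        ],
        'classes': [
            [0],     # one class id per box; 0 = "face"
            [0, 0],
        ],
    },
}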
I’ve tried manipulating this data in many ways, but I can’t make it work with the dataset-visualization function in the guide above. For example, my current code below throws an error when converting the bounding box coordinates to an eager tensor. Given the information I can extract, how should I prepare the dataset so that Keras will accept it?
import tensorflow as tf

def convert_to_tf_dataset(data_dict):
    images = []
    bboxes = []
    classes = []
    for i in range(len(data_dict['images'])):
        # Collect the per-image data: the decoded image plus its boxes and class ids
        images.append(data_dict['images'][i])
        bboxes.append(data_dict['bounding_boxes']['boxes'][i])
        classes.append(data_dict['bounding_boxes']['classes'][i])
    # Ragged tensors, since the number of faces (and the image size) varies per image
    bboxes = tf.ragged.constant(bboxes, dtype=tf.float32)
    classes = tf.ragged.constant(classes, dtype=tf.int32)
    images = tf.ragged.constant(images, dtype=tf.uint8)  # assuming the images are uint8
    dataset = tf.data.Dataset.from_tensor_slices(
        {"images": images, "bounding_boxes": {"boxes": bboxes, "classes": classes}}
    )
    return dataset

train_ds = generate_dataset(TRAIN_BBX_FILE, TRAIN_IMAGES_DIR)
tf_dataset = convert_to_tf_dataset(train_ds)

BATCH_SIZE = 4
tf_dataset = tf_dataset.ragged_batch(BATCH_SIZE, drop_remainder=True)
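In case it matters, this is roughly how I’m calling the visualization helper from the guide (reproduced from memory, so the exact arguments may be slightly off):

visualize_dataset(
    tf_dataset,
    value_range=(0, 255),
    rows=2,
    cols=2,
    bounding_box_format="xywh",  # WIDER FACE boxes are x, y, width, height in pixels
)

The ragged_batch call is there because the number of faces varies per image, so the boxes and classes can’t be stacked into a dense tensor.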