I am using Vertex AI to train an Object Detection model. I have created my dataset as per the documentation, using 1000×1000 px images for training. I then export the Edge ML model as a `.tflite` file and import it into my project.
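For reference, TFLite detection models of this kind typically emit a flat boxes tensor with four normalized (0–1) values per detection, ordered `[ymin, xmin, ymax, xmax]` — worth double-checking against your export's metadata. A small decoder sketch under that assumption (all names here are mine, not from the project):

```typescript
interface DetBox {
  top: number;
  left: number;
  bottom: number;
  right: number;
}

// Decode a flat detection-boxes tensor into per-detection objects.
// Assumes the common TFLite layout: 4 normalized values per box,
// ordered [ymin, xmin, ymax, xmax]. Verify against your model's metadata.
function decodeBoxes(flat: ArrayLike<number>, count: number): DetBox[] {
  const boxes: DetBox[] = [];
  for (let i = 0; i < count; i++) {
    const base = i * 4;
    boxes.push({
      top: flat[base],
      left: flat[base + 1],
      bottom: flat[base + 2],
      right: flat[base + 3],
    });
  }
  return boxes;
}
```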
My React Native app uses a camera from react-native-vision-camera, and I also installed the TFLite package. The app runs, so the setup seems right. However, the bounding box on screen is not what I expect: the Skia bounding box is drawn out of the screen view, as you can see:

Here is what I am doing with the `frameProcessor`:
```typescript
const frameProcessor = useSkiaFrameProcessor(
  frame => {
    'worklet';
    if (!modelFile?.runSync) {
      return frame;
    }
    frame.render();
    runAtTargetFps(30, () => {
      // 1. Resize the 4K frame to match the model's input requirements
      //    (vision-camera-resize-plugin)
      const resized = resize(frame, {
        scale: {
          height: 512,
          width: 512,
        },
        pixelFormat: 'rgb',
        dataType: 'uint8',
      });
      const done = modelFile.runSync([resized]);
      const [boundingBoxes, labels, scores] = done;
      const top = boundingBoxes[0] as number;
      const left = boundingBoxes[1] as number;
      const bottom = boundingBoxes[2] as number;
      const right = boundingBoxes[3] as number;

      // Add a new position, keeping at most `maxLength` entries
      function addPosition() {
        const newPosition = { top, left, bottom, right };
        // If the array is already at max length, remove the oldest item
        if ((positions.value?.length || 0) >= maxLength) {
          const copy = [...(positions.value || [])];
          copy.shift();
          positions.value = copy;
        }
        // Add the new position to the end of the array
        positions.value = [...(positions.value || []), newPosition];
      }

      if (scores[0] >= 0.7) {
        addPosition();
        const position = calculateAveragePosition();
        box.value = position;
        labelText.value =
          objectDetectMap[labels[0] as keyof typeof objectDetectMap];
      } else {
        box.value = null;
      }
    });

    if (box.value) {
      const rect = Skia.XYWHRect(
        // without the multiplier, the box is tiny & off screen
        box.value.left * frame.width,
        box.value.top * frame.height,
        (box.value.right - box.value.left) * frame.width,
        (box.value.bottom - box.value.top) * frame.height
      );
      const rectPaint = Skia.Paint();
      rectPaint.setStyle(PaintStyle.Stroke);
      rectPaint.setStrokeWidth(20);
      rectPaint.setColor(Skia.Color('red'));
      frame.drawRect(rect, rectPaint);

      if (font) {
        const fontPaint = Skia.Paint();
        fontPaint.setColor(Skia.Color('red'));
        const text = Skia.TextBlob.MakeFromText(
          labelText.value || 'Move Closer',
          font
        );
        frame.drawTextBlob(
          text,
          box.value.left * frame.width,
          box.value.top * frame.height,
          fontPaint
        );
      }
    }
  },
  [modelFile, box, calculateAveragePosition, sharedLoadingModel]
);
```
My `frame` is 1920×1080. I also tried a smaller camera resolution, below the 1024 px image maximum mentioned in the Vertex documentation, but since the resize plugin scales every frame down to 512×512 RGB anyway, I didn't think the frame resolution itself mattered?
What is wrong with my logic?