I am running VNRecognizeTextRequests on individual captured frames of an ARSession.
As a result I get the recognized text blocks and their bounding boxes.
I want to display these bounding boxes in the ARView as RealityKit meshes.
To do so, I perform a raycast from the center point of each bounding box and place an AnchorEntity with a ModelEntity as its child at the first intersection of the ray.
The MeshResource.generatePlane method requires the width and height of the resulting rectangle in metres. How do I calculate these dimensions from the size of the bounding box in screen coordinates?
Here’s some of my relevant code:
Handling recognized textblocks
let transform = visionTransform(frame: arView.session.currentFrame!, viewport: UIScreen.main.bounds)
...
for observation in observations {
    if let recognizedText = observation.topCandidates(1).first?.string {
        let imageRect = observation.boundingBox.applying(transform)
        self.highlight(rect: imageRect)
    }
}
Transforming Vision coordinates to ARKit coordinates
private func visionTransform(frame: ARFrame, viewport: CGRect) -> CGAffineTransform {
    let orientation = UIApplication.shared.statusBarOrientation
    // Maps normalized image coordinates to normalized view coordinates.
    let transform = frame.displayTransform(for: orientation,
                                           viewportSize: viewport.size)
    // Scales normalized view coordinates up to the viewport size in points.
    let scale = CGAffineTransform(scaleX: viewport.width,
                                  y: viewport.height)
    // Start from the identity: CGAffineTransform() is the all-zero matrix,
    // which would collapse every rect if neither branch below runs.
    var t = CGAffineTransform.identity
    if orientation.isPortrait {
        t = CGAffineTransform(scaleX: -1, y: 1)
        t = t.translatedBy(x: -viewport.width, y: 0)
    } else if orientation.isLandscape {
        t = CGAffineTransform(scaleX: 1, y: -1)
        t = t.translatedBy(x: 0, y: -viewport.height)
    }
    return transform.concatenating(scale).concatenating(t)
}
Displaying bounding boxes in ARView
private func highlight(rect: CGRect) {
    guard let raycastResult = arView.raycast(from: CGPoint(x: rect.midX, y: rect.midY),
                                             allowing: .estimatedPlane,
                                             alignment: .any).first else {
        return
    }
    let anchor = AnchorEntity(raycastResult: raycastResult)
    let plane = ModelEntity(mesh: MeshResource.generatePlane(width: 0.1, height: 0.05),
                            materials: [SimpleMaterial(color: .yellow.withAlphaComponent(0.5), isMetallic: false)])
    anchor.addChild(plane)
    arView.scene.addAnchor(anchor)
}
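One possible way to replace the hard-coded 0.1 and 0.05, offered only as a sketch: raycast through the midpoints of the rect's four edges in addition to its center, then measure the distances between the resulting world-space hit points. The helper below covers only the measuring step. The name planeSize and its parameters are hypothetical, not part of ARKit or RealityKit. It assumes all four rays hit the same roughly flat surface and that each position was read from ARRaycastResult.worldTransform.columns.3, which is expressed in metres.

```swift
// Hypothetical helper, not an ARKit or RealityKit API.
// Each argument is a world-space hit position in metres, for example the
// xyz part of ARRaycastResult.worldTransform.columns.3 for a ray cast
// through the midpoint of the corresponding edge of the screen-space rect.
func planeSize(left: SIMD3<Float>, right: SIMD3<Float>,
               top: SIMD3<Float>, bottom: SIMD3<Float>) -> (width: Float, height: Float) {
    // Euclidean distance between two points in world space.
    func distance(_ a: SIMD3<Float>, _ b: SIMD3<Float>) -> Float {
        let d = a - b
        return (d * d).sum().squareRoot()
    }
    // Width spans the left and right edge midpoints,
    // height spans the top and bottom edge midpoints.
    return (width: distance(left, right), height: distance(top, bottom))
}
```

In highlight(rect:), the four inputs would come from raycasts at (rect.minX, rect.midY), (rect.maxX, rect.midY), (rect.midX, rect.minY) and (rect.midX, rect.maxY). If any of those raycasts returns no result, or the rays land on different surfaces, the measured size will be unreliable, so a fallback size is probably still needed.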