What am I Doing?
I am building an application that takes the feed from an external webcam and maps that feed to a UIView covering the entire screen.
This application also needs to track facial features, namely how open the user's mouth is, and pass that data across a USB connection to a receiving device. The mouth-openness value is used to update the height of a mouth asset on the receiving device: if the user of the app opens their mouth, the face displayed on the receiving device opens its mouth as well, and so on.
I am currently able to do all of this by using the Vision API in conjunction with AVCaptureMultiCamSession. Here you can see an example, with a preview of the mouth-tracking data that will be sent to the receiving device in the bottom-right corner.
Vision's VNFaceObservation.landmarks?.innerLips data essentially gives you a hexagon of points outlining the user's mouth.
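For context, the innerLips points come from a request along these lines (a rough sketch; the function is my own and error handling is simplified):
import Vision

// Rough sketch: run face-landmark detection on one frame and pull out the innerLips points.
// pixelBuffer is assumed to come from the capture session's video data output.
func innerLipsPoints(in pixelBuffer: CVPixelBuffer) -> [CGPoint] {
    let request = VNDetectFaceLandmarksRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up, options: [:])
    try? handler.perform([request])

    guard let face = (request.results as? [VNFaceObservation])?.first,
          let innerLips = face.landmarks?.innerLips else {
        return []
    }
    // Normalized (0...1) points outlining the inner edge of the lips.
    return innerLips.normalizedPoints
}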
We then calculate the “height” of the mouth via:
// points are the innerLips landmark points for the current frame
let points: [CGPoint] = [ /* add the innerLips points here */ ]
let pointYs = points.map { $0.y }

if let maxY = pointYs.max(), let minY = pointYs.min() {
    let mouthHeight = maxY - minY
    // multiply mouthHeight by some factor, serialize, and pass across the USB connection
}
The Issue
The issue is that this approach makes the mouthHeight value very jumpy, i.e. the mouth asset, whose height the data drives, flickers a lot because there is no smoothing or moving average applied to the mouth heights. Note that I was not moving my mouth in the example below, yet the UI still flickers.
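To be clear about what I mean by a "smooth moving average", this is the kind of filtering the raw values currently lack (a rough sketch; the smoothing factor is arbitrary and untuned):
// Rough sketch of an exponential moving average over successive mouthHeight values.
// smoothingFactor is an arbitrary, untuned value.
var smoothedMouthHeight: Double = 0
let smoothingFactor = 0.2

func smoothed(_ newHeight: Double) -> Double {
    smoothedMouthHeight = smoothingFactor * newHeight + (1 - smoothingFactor) * smoothedMouthHeight
    return smoothedMouthHeight
}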
Using ARKit would be much more applicable for this use case. When I look at how an Animoji mimics the user’s mouth “openness”, I don’t observe this flickering.
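For reference, the ARKit value I would prefer to drive the mouth asset with is roughly this (a sketch, assuming an ARSCNView running a face-tracking session):
// Sketch of reading ARKit's jaw-open coefficient inside an ARSCNViewDelegate callback.
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor else { return }
    // Blend-shape coefficient from 0.0 (mouth closed) to 1.0 (mouth fully open).
    let jawOpen = faceAnchor.blendShapes[.jawOpen]?.doubleValue ?? 0
    // multiply jawOpen by some factor, serialize, and pass across the USB connection
}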
What I’ve Tried
- I have explored using an ARSCNView whose .session is running an ARWorldTrackingConfiguration with userFaceTrackingEnabled = true, as seen in this example. The problem with this approach is that, while it shows me the external world while also giving me ARFaceAnchor data, I am fairly certain there is no way to change the ARWorldTrackingConfiguration's camera to be an external camera. (A minimal version of this setup is sketched after this list.)
- I have tried removing the Vision framework entirely and using an ARSCNView whose .session is running an ARFaceTrackingConfiguration, while continuing to get the external webcam feed via:
let multiCamSession = AVCaptureMultiCamSession()
// Look for an external (UVC) webcam.
let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.external], mediaType: .video, position: .unspecified)
if let device = deviceDiscoverySession.devices.first,
   let deviceInput = try? AVCaptureDeviceInput(device: device),
   multiCamSession.canAddInput(deviceInput) {
    multiCamSession.addInput(deviceInput)
}
but, unfortunately, I cannot get both the external webcam feed and the selfie camera feed to display at the same time in their respective UIView and ARSCNView. It seems that the ARSCNView stops the external camera feed from the AVCaptureMultiCamSession as soon as it loads its own camera.
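For completeness, the configuration referenced in the first bullet is roughly this (a sketch, where sceneView is assumed to be the ARSCNView):
// Sketch of the world-tracking-plus-user-face-tracking setup from the first bullet.
let configuration = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsUserFaceTracking {
    configuration.userFaceTrackingEnabled = true
}
sceneView.session.run(configuration)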
What I am wondering
What paths forward do I have to use ARKit’s face tracking while also having an external webcam feed? Consider:
A. Is my assumption that "there is no way to change the ARWorldTrackingConfiguration's camera to be an external camera" true?
B. Is there a way to use ARSCNView, or any other means of grabbing face-tracking data via ARKit from the device's selfie cam, while also seeing the external camera feed, via an AVCaptureMultiCamSession or otherwise?