I have an embedded device and a camera. I am trying to estimate the Euler angles of a person's head, in particular the yaw angle, to check whether the person is looking at the camera or somewhere else. I have an object detection model that detects heads, and I understand that I need to go from 2D to 3D to get the yaw angle. My question is: can this be done using only the bounding box locations of the detected person, or do I need facial landmarks as well? In either case, I would like to understand how this can be done, if it is possible, or whether there are other methods for getting the yaw angle.