As part of a project I’m working on, I need to recover the camera location using solvePnP.
The project (OpenGL) goes as follows:
- Create a 3D world (a triangle mesh) that is a height map of some photo.
- Travel through this world.
- In a scene of your choice, pick six 3D points and record their matching 2D image points.
- Calculate where the camera was when you picked those points.
NOTES:
- There is a single OpenGL window, split in two: the left half is the global view and the right half is the camera view.
- We’re using color picking to map between triangles (their center points) and unique IDs (colors); a minimal sketch of the mapping follows these notes.
- Our color picking works just fine.
- We’re struggling with step 4: finding the location of the camera using solvePnP.
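For reference, the ID-to-color mapping works roughly like this (a minimal sketch, not our exact code; it assumes triangle IDs fit in 24 bits):

```cpp
#include <cstdint>

// Encode a triangle ID into an RGB color, 8 bits per channel.
inline void idToColor(uint32_t id, float& r, float& g, float& b) {
    r = ((id >> 16) & 0xFF) / 255.0f;
    g = ((id >> 8)  & 0xFF) / 255.0f;
    b = ( id        & 0xFF) / 255.0f;
}

// Decode the RGB bytes read back with glReadPixels into the triangle ID.
inline uint32_t colorToId(uint8_t r, uint8_t g, uint8_t b) {
    return (uint32_t(r) << 16) | (uint32_t(g) << 8) | uint32_t(b);
}
```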
Relevant code:
- Picking the 6 points in a scene:
```cpp
double xpos, ypos;
glfwGetCursorPos(window, &xpos, &ypos);

int framebufferWidth, framebufferHeight;
glfwGetFramebufferSize(window, &framebufferWidth, &framebufferHeight);
int windowWidth, windowHeight;
glfwGetWindowSize(window, &windowWidth, &windowHeight);

// Convert the cursor position to framebuffer pixels (bottom-left origin).
int pixelX = static_cast<int>(xpos * framebufferWidth / windowWidth);
int pixelY = framebufferHeight - static_cast<int>(ypos * framebufferHeight / windowHeight) - 1;

// Only process picking if the click is on the right half of the screen.
if (pixelX >= framebufferWidth / 2) {
    // Adjust pixelX to be relative to the right viewport.
    pixelX -= framebufferWidth / 2;
    // Find the color of the pixel and retrieve the ID.
    // With this ID, find the matching triangle.
    // If found:
    imagePoints.push_back(glm::vec2(pixelX, pixelY));
    objectPoints.push_back(point); // the matched triangle's 3D center
}
```
- Trying to compute the camera pose using solvePnP:
```cpp
void computeCameraPose(GLFWwindow* window)
{
    if (imagePoints.size() < 4 || objectPoints.size() < 4) {
        std::cout << "Not enough points to compute camera pose. Need at least 4 points." << std::endl;
        return;
    }

    // Convert image points to OpenCV format.
    std::vector<cv::Point2f> cvImagePoints;
    int width, height;
    glfwGetFramebufferSize(window, &width, &height);
    for (const auto& point : imagePoints) {
        // Convert from OpenGL coordinates to OpenCV coordinates.
        float x = point.x + width / 2;
        float y = height - point.y; // flip the Y-coordinate (OpenCV's origin is top-left)
        cvImagePoints.push_back(cv::Point2f(x, y));
    }

    // Convert object points to OpenCV format.
    std::vector<cv::Point3f> cvObjectPoints;
    for (const auto& point : objectPoints) {
        // Convert between the two right-handed coordinate systems:
        // OpenGL: +Y is up, +Z points backwards (toward the viewer).
        // OpenCV: +Y is down, +Z points forwards (into the scene).
        cvObjectPoints.push_back(cv::Point3f(point.x, -point.y, -point.z));
    }

    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
        width, 0,     width / 2,
        0,     width, height / 2,
        0,     0,     1);
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64F);

    cv::Mat rvec, tvec;
    cv::solvePnP(cvObjectPoints, cvImagePoints, cameraMatrix, distCoeffs, rvec, tvec);

    cv::Mat ZYX;
    cv::Rodrigues(rvec, ZYX);
    std::cout << "Rodrigues" << std::endl;
    std::cout << ZYX << std::endl;
    std::cout << "done with Rodrigues" << std::endl;

    // Form the 4x4 world-to-camera transformation matrix.
    cv::Mat totalrotmax = (cv::Mat_<double>(4, 4) <<
        ZYX.at<double>(0, 0), ZYX.at<double>(0, 1), ZYX.at<double>(0, 2), tvec.at<double>(0),
        ZYX.at<double>(1, 0), ZYX.at<double>(1, 1), ZYX.at<double>(1, 2), tvec.at<double>(1),
        ZYX.at<double>(2, 0), ZYX.at<double>(2, 1), ZYX.at<double>(2, 2), tvec.at<double>(2),
        0, 0, 0, 1);

    // Invert it to get camera-to-world; the last column should then hold the
    // camera position (still in the flipped OpenCV-style coordinates from above).
    cv::Mat inverserotmax = totalrotmax.inv();
    std::cout << "camera position: "
              << inverserotmax.at<double>(0, 3) << ", "
              << inverserotmax.at<double>(1, 3) << ", "
              << inverserotmax.at<double>(2, 3) << std::endl;
}
```
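For what it's worth, my understanding is that the intrinsics should be derived from the perspective projection actually used for the right viewport, roughly as below. This is a sketch, not code we have verified: `fovyDeg` stands in for whatever vertical FOV we pass to glm::perspective, and `width`/`height` are the framebuffer size as in the function above.

```cpp
// Sketch: pinhole intrinsics matching glm::perspective(glm::radians(fovyDeg), ...)
// for the right viewport, which is (width / 2) x height pixels.
double fovyDeg   = 45.0;                // placeholder: our actual vertical FOV
double viewportW = width / 2.0;         // the camera view fills the right half
double viewportH = height;
double fy = (viewportH / 2.0) / std::tan(fovyDeg * CV_PI / 360.0); // (H/2) / tan(fovy/2)
double fx = fy;                         // square pixels
cv::Mat K = (cv::Mat_<double>(3, 3) <<
    fx, 0,  viewportW / 2.0,
    0,  fy, viewportH / 2.0,
    0,  0,  1);
```

If the intrinsics are defined for the right viewport like this, I assume the image points should then stay in right-viewport coordinates (i.e. without adding width/2 back), but I'm not sure this is correct.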
However, the results I’m getting are really far off.
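To make "far off" measurable, a reprojection check along these lines could be dropped in right after the solvePnP call (a sketch reusing the variables from the function above):

```cpp
// Sketch: reproject the 3D points with the recovered pose and compare with the clicks.
std::vector<cv::Point2f> reprojected;
cv::projectPoints(cvObjectPoints, rvec, tvec, cameraMatrix, distCoeffs, reprojected);
double meanErr = 0.0;
for (size_t i = 0; i < reprojected.size(); ++i) {
    cv::Point2f d = reprojected[i] - cvImagePoints[i];
    meanErr += std::sqrt(static_cast<double>(d.x * d.x + d.y * d.y));
}
meanErr /= reprojected.size();
std::cout << "mean reprojection error: " << meanErr << " px" << std::endl;
```

My assumption is that a small reprojection error combined with a wrong-looking camera position would point at the coordinate conversions rather than at solvePnP itself.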
Thank you in advance for any help.
I have already tried searching the internet, including asking LLMs, without success.