I believe mesh rendering should not be a very expensive operation, since games do it very efficiently. Yet all the works I am aware of treat mesh rendering as a preprocessing step:
The state of the art:
- Given: a mesh dataset with textures, pre-defined camera parameters (intrinsics matrix), and world-to-camera transformations (extrinsics matrices)
- Pre-processing: in an offline step, loop through the mesh dataset, render images, depth maps, and foreground masks, and save them together with their respective camera parameters.
- Training: use torch.utils.data.Dataset.__getitem__() to load these pre-rendered outputs and their camera parameters (a minimal sketch of this pipeline follows below).
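For concreteness, this is roughly what I mean by that baseline; render_mesh, the file layout, and the .npz keys are placeholders I made up, not any particular library's API:

    import numpy as np
    import torch

    def preprocess(mesh_paths, extrinsics, intr, out_dir):
        # Offline step: render every (mesh, pose) pair once and cache it to disk.
        for i, (path, extr) in enumerate(zip(mesh_paths, extrinsics)):
            rgb, depth, mask = render_mesh(path, intr, extr)  # hypothetical renderer, possibly GPU-based
            np.savez(f"{out_dir}/{i:06d}.npz", rgb=rgb, depth=depth, mask=mask, intr=intr, extr=extr)

    class PrerenderedDataset(torch.utils.data.Dataset):
        def __init__(self, files):
            self.files = files

        def __len__(self):
            return len(self.files)

        def __getitem__(self, idx):
            # Training time: only cheap disk reads, no rendering.
            d = np.load(self.files[idx])
            return d["rgb"], d["depth"], d["mask"], d["intr"], d["extr"]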
Problems:
- Disk space: storing a render for every pre-computed view quickly adds up.
- We can only cover a fixed set of views; we cannot pre-render every possible pose.
- Some libraries require GPU utilization to do the rendering. We cannot afford to occupy the GPU with this preprocessing step.
What I would like instead:
Instead, I would like to handle the mesh rendering inside the PyTorch DataLoader worker processes. Specifically, I am looking for something in the spirit of:
import torch

class MeshRenderingDataset(torch.utils.data.Dataset):
    ...

    def __getitem__(self, idx):
        # Load the mesh lazily, inside the worker process
        loaded_mesh = load_mesh(self.mesh_paths[idx])
        # Let's say all camera intrinsics are the same and defined in __init__
        extr = self.get_random_pose_around_mesh()  # this method should define the interval for the random extrinsics
        rgb, depth, mask = loaded_mesh.render(self.intr, extr)
        return rgb, depth, mask, self.intr, extr
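which I would then plug into a standard DataLoader so that the rendering runs inside the worker processes, roughly like this (batch size and worker count are arbitrary):

    from torch.utils.data import DataLoader

    dataset = MeshRenderingDataset(...)  # hypothetical constructor arguments omitted
    loader = DataLoader(dataset, batch_size=8, num_workers=4, pin_memory=True)
    for rgb, depth, mask, intr, extr in loader:
        ...  # training step consumes freshly rendered views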
Any ideas on how to implement load_mesh and loaded_mesh.render() efficiently, so that the PyTorch workers can handle this task? If you think this is not possible, an explanation would also be much appreciated!
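For reference, the rough direction I have in mind is pure-CPU offscreen rendering, for example loading with trimesh and rendering with pyrender under an OSMesa (or EGL) context. That is just one combination I happen to know of, not a requirement, and the sketch below is untested and only meant to show the idea:

    import os
    # Must be set before pyrender is imported; "osmesa" is pure-CPU software
    # rendering, "egl" would use the GPU instead.
    os.environ.setdefault("PYOPENGL_PLATFORM", "osmesa")

    import numpy as np
    import pyrender
    import trimesh

    def load_mesh(path):
        # force='mesh' collapses multi-geometry files into a single Trimesh
        return trimesh.load(path, force="mesh")

    def render(tri_mesh, intr, extr, width=640, height=480):
        # One offscreen render given a 3x3 intrinsics matrix `intr` and a 4x4
        # world-to-camera matrix `extr` (OpenCV convention assumed).
        scene = pyrender.Scene(ambient_light=[1.0, 1.0, 1.0])
        scene.add(pyrender.Mesh.from_trimesh(tri_mesh))

        cam = pyrender.IntrinsicsCamera(fx=intr[0, 0], fy=intr[1, 1],
                                        cx=intr[0, 2], cy=intr[1, 2])
        # pyrender expects a camera-to-world pose in OpenGL convention, so
        # invert the extrinsics and flip the y/z axes.
        pose = np.linalg.inv(extr) @ np.diag([1.0, -1.0, -1.0, 1.0])
        scene.add(cam, pose=pose)

        # Creating an OffscreenRenderer per call is wasteful; it should
        # probably be cached once per worker (e.g. via worker_init_fn).
        renderer = pyrender.OffscreenRenderer(viewport_width=width, viewport_height=height)
        rgb, depth = renderer.render(scene)
        renderer.delete()
        return rgb, depth, depth > 0  # mask = pixels with valid depth

My main doubts are whether a GL context per fork-based worker is safe and whether the per-frame cost is low enough, which is exactly what I am asking about.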