I am using GCP as my cloud platform and have a Standard cluster running on it. The cluster has 2 GPU node pools with Nvidia T4 GPUs. I want to set up a Horizontal Pod Autoscaler (HPA) based on CPU and GPU RAM usage. I have tried many options, but none of them seem to work for me.
For example, I am trying to set up the Nvidia DCGM exporter by following the official documentation on its GitHub page. When I deploy it, the pods keep restarting without ever completing successfully. Helm also doesn't seem to work properly for me here. I find many options, but each one is either about 3 years old or its steps don't work for me.
If anyone has any ideas on this, it would be very helpful.
I have tried almost every blog post I could find on the internet with a title along the lines of “HPA based on GPU”, but nothing has worked for me.