I am trying to get this code to work so that I can use it to train various models on two GPUs:
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

if __name__ == "__main__":
    with LocalCUDACluster(n_workers=2) as cluster:
        with Client(cluster) as client:
            ...
The error starts with:
- distributed.worker - ERROR - Worker plugin CPUAffinity-bd2c98dd-df91-4dc5-8dd9-cca207e4c3fc failed to setup
I am on a server provided by my university, and it keeps raising an error on the CPU affinity setting. If I set only one worker, it works normally. It also works with LocalCluster, but then I have the problem that both workers end up on the same GPU.
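One thing that may help narrow this down: on shared servers, a job scheduler or container often restricts which CPU cores a process is allowed to run on. If the affinity plugin tries to pin a worker to cores outside that set, the setup step could fail. This is an assumption about the cause, not something the log above confirms. A minimal, standard-library-only sketch to print the cores available to the current process (`allowed_cpus` is a hypothetical helper name, not part of Dask):

```python
import os


def allowed_cpus():
    """Return the sorted CPU core IDs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: ask the kernel for the affinity mask of the current process (pid 0).
        return sorted(os.sched_getaffinity(0))
    # Assumption for platforms without sched_getaffinity: treat all cores as usable.
    return list(range(os.cpu_count() or 1))


if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"{len(cpus)} of {os.cpu_count()} cores available: {cpus}")
```

If the list is much shorter than the machine's total core count, the restricted allocation is a plausible reason the affinity setup fails with two workers but not with one.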