Tag Archive for: machine-learning, pytorch, simulation, distributed-computing

How to develop multi-GPU modules on a single-node, single-GPU system in PyTorch?

I’m developing a multi-GPU PyTorch application. The existing collective operations such as scatter/gather in torch.distributed don’t fulfill my requirements on their own, so I need to implement custom forward/backward steps that send and receive gradients across GPUs, while still using the built-in scatter/gather primitives underneath. I can implement that part myself; my question is how to develop and test such multi-GPU modules on a single-node, single-GPU machine, given that the final application will run on a multi-GPU server.
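One common way to develop this kind of code without multiple GPUs is to run several CPU processes with the `gloo` backend, since `torch.distributed` scatter/gather work on CPU tensors there; the same rank logic can later be switched to `nccl` with one GPU per rank. The sketch below is a minimal simulation under that assumption (the port number and tensor shapes are arbitrary choices for illustration):

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank, world_size, port):
    # Each spawned process acts as one "GPU rank", but all tensors
    # live on CPU so this runs on a single-GPU (or GPU-less) machine.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # scatter: rank 0 hands one chunk to every rank
    recv = torch.zeros(2)
    if rank == 0:
        chunks = [torch.full((2,), float(r)) for r in range(world_size)]
        dist.scatter(recv, scatter_list=chunks, src=0)
    else:
        dist.scatter(recv, src=0)

    # pretend each rank computed something on its chunk
    recv += 1

    # gather: every rank sends its result back to rank 0
    if rank == 0:
        gathered = [torch.zeros(2) for _ in range(world_size)]
        dist.gather(recv, gather_list=gathered, dst=0)
        expected = [[float(r + 1)] * 2 for r in range(world_size)]
        assert [t.tolist() for t in gathered] == expected
    else:
        dist.gather(recv, dst=0)

    dist.destroy_process_group()


if __name__ == "__main__":
    mp.spawn(worker, args=(2, 29500), nprocs=2)
```

For true single-GPU simulation of multi-GPU placement, another option is to point every rank at the same device (`cuda:0`), which exercises the communication pattern but not the memory isolation of a real multi-GPU server.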