Can someone please tell me what happens under the hood when you define a network in PyTorch and actually train it?
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.layer2(x)
        return x
# then I train my network
model = SimpleNN(input_size, hidden_size, output_size)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
model.train()
optimizer.zero_grad()
x = torch.tensor(x_train, dtype=torch.float32).view(-1,1)
target = torch.tensor(y_train, dtype=torch.float32).view(-1,1)
output = model(x)
loss = criterion(output, target)
loss.backward()
optimizer.step()
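Here is the rough mental model I currently have of what loss.backward() and optimizer.step() are doing for plain SGD (a hand-written sketch of my own understanding, not something taken from the PyTorch source; the tensors and learning rate are made up):

import torch

# Rough manual equivalent of loss.backward() + optimizer.step() for plain SGD,
# written by hand to show my current picture of the update.
w = torch.randn(3, requires_grad=True)
x = torch.randn(3)
loss = ((w * x).sum() - 1.0) ** 2

loss.backward()            # autograd walks the recorded graph and fills w.grad
with torch.no_grad():
    w -= 0.01 * w.grad     # the SGD update: param = param - lr * grad
    w.grad.zero_()         # the same role optimizer.zero_grad() plays above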
My question is: how does this code build the computational graph and perform autograd? Also, when you call .cuda(), what part of PyTorch is responsible for moving data to the GPU?
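To be concrete about the second part, these are the kinds of device moves I mean (an illustrative sketch only; the nn.Linear shapes and the device check are placeholders I made up, not part of my training script):

import torch
import torch.nn as nn

# Illustrative only: the kind of host-to-GPU moves I am asking about.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)   # parameters and buffers are copied to GPU memory
x = torch.randn(8, 4).to(device)     # the input tensor is copied to GPU memory
y = model(x)                         # this matmul now runs on the CUDA device
# model.cuda() and x.cuda() are the older spellings of .to(device)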
I’ve done some research but found nothing helpful.
I know that Python is an interpreted language, and that PyTorch builds the computational graph dynamically, that is, a new graph is generated on every forward pass.
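For example, this is the small check I wrote to convince myself of that (my own sketch; I am assuming that grad_fn being a new object each iteration really does mean a fresh graph was recorded):

import torch

# grad_fn points at the autograd node that produced the tensor; it is a new
# object on every forward pass, which I read as "a new graph each time".
a = torch.randn(3, requires_grad=True)
for _ in range(2):
    b = (a * 2).sum()
    print(b.grad_fn)   # e.g. <SumBackward0 object at 0x...>, a different id each time
    b.backward()       # walks the graph just recorded; it is then freed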
So, is there any architectural post about PyTorch's internals like this one for Java:
https://www.geeksforgeeks.org/compilation-execution-java-program/
Many thanks!