I’m trying to create a vision transformer model for a personal project using PyTorch.
The problem is that when I run my test code, I’m not sure whether I’m correctly calculating the training loss, validation loss, training accuracy, and testing (plus top-2 testing) accuracy.
This is my code:
criterion = nn.CrossEntropyLoss()
optimizer = AdamW(vit.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY)

# Training Loop
train_losses, val_losses, accuracies, top2_accuracies = [], [], [], []
training_start = time.time()
for epoch in range(NUM_EPOCHS):
    log_str = write_and_print_str(log_str, f"EPOCH [{epoch+1}/{NUM_EPOCHS}]")
    start = time.time()

    vit.train()
    running_loss = []
    for images, labels in train_dataloader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = vit(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss.append(loss.item())

    avg_train_loss = sum(running_loss) / len(running_loss)
    train_losses.append(avg_train_loss)
    log_str = write_and_print_str(log_str, f'Loss: {avg_train_loss}')

    vit.eval()
    val_loss = []
    correct = 0
    top2_correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = vit(images)
            loss = criterion(outputs, labels)
            val_loss.append(loss.item())
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
            # Calculate top-2 accuracy
            top2_pred = torch.topk(outputs, 2, dim=1).indices
            top2_correct += (top2_pred == labels.unsqueeze(1)).sum().item()
    end = time.time()

    avg_val_loss = sum(val_loss) / len(val_loss)
    accuracy = 100 * correct / total
    top2_accuracy = 100 * top2_correct / total
    val_losses.append(avg_val_loss)
    accuracies.append(accuracy)
    top2_accuracies.append(top2_accuracy)
    log_str = write_and_print_str(log_str, f'Validation Loss: {avg_val_loss}, \nAccuracy: {accuracy}%,\nTop-2 Accuracy: {top2_accuracy}%\nTime:{round(end-start, 2)}\n\n')

training_end = time.time()
log_str = write_and_print_str(log_str, f'Training Duration: {round(training_end - training_start, 2)}\n')
print("EPOCHS were saved to the file successfully")
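To cross-check the counting logic above independently of PyTorch, here is a minimal plain-Python sketch. The helper names `topk_correct` and `weighted_mean` and the toy numbers are made up for illustration; they are not part of my training script. It also shows one thing I was unsure about: since `nn.CrossEntropyLoss()` returns the mean over each batch by default, averaging those batch means is only exact when every batch has the same size, so a smaller last batch would shift the result slightly.

```python
def topk_correct(logits, labels, k):
    """Count samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(logits, labels):
        # Class indices sorted by descending score, truncated to the top k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        # Indices in `ranked` are distinct, so a sample is counted at most once.
        hits += int(label in ranked)
    return hits


def weighted_mean(batch_means, batch_sizes):
    """Average per-batch mean losses, weighting each batch by its sample count."""
    total = sum(batch_sizes)
    return sum(m * n for m, n in zip(batch_means, batch_sizes)) / total


# Toy data (hypothetical): 4 samples, 3 classes.
logits = [
    [2.0, 1.0, 0.1],
    [0.2, 0.3, 3.0],
    [1.5, 1.4, 0.0],
    [0.1, 2.0, 1.0],
]
labels = [0, 1, 1, 0]

top1 = topk_correct(logits, labels, k=1)
top2 = topk_correct(logits, labels, k=2)
print(f"top-1: {100 * top1 / len(labels)}%, top-2: {100 * top2 / len(labels)}%")

# Two batches with hypothetical mean losses; the last batch is smaller.
batch_means = [0.50, 0.20]
batch_sizes = [32, 8]
plain = sum(batch_means) / len(batch_means)      # what sum(...) / len(...) gives
weighted = weighted_mean(batch_means, batch_sizes)  # per-sample average
print(f"mean of batch means: {plain}, sample-weighted mean: {weighted}")
```

If the two averages differ noticeably on real data, accumulating `loss.item() * labels.size(0)` and dividing by `total` would be one way to get the per-sample average instead.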
I ran it more than 20 times and logged every run in detail.
In all of the results, the validation loss started above 1 and dropped to around 0.20. The problem is that in these runs my accuracy was around 90%, so I assume I’m doing something wrong in the calculations in my code.
To give more detail about the numerical values of accuracy and loss, here are some of my epoch results:
EPOCH [1/50]
Loss: 1.4692728799123032
Validation Loss: 1.1625839814995274,
Accuracy: 49.58932238193019%,
Top-2 Accuracy: 75.77002053388091%
Time:29.71
EPOCH [10/50]
Loss: 0.1079550055715327
Validation Loss: 0.5106942771059094,
Accuracy: 83.26488706365502%,
Top-2 Accuracy: 97.53593429158111%
Time:25.86
EPOCH [20/50]
Loss: 0.037730065656293076
Validation Loss: 0.4059527646185774,
Accuracy: 89.52772073921972%,
Top-2 Accuracy: 97.94661190965093%
Time:26.12
EPOCH [30/50]
Loss: 0.00011380775267753052
Validation Loss: 0.22308006276955095,
Accuracy: 94.6611909650924%,
Top-2 Accuracy: 99.48665297741273%
Time:24.41
EPOCH [40/50]
Loss: 3.5449059315886606e-05
Validation Loss: 0.23672400451808548,
Accuracy: 94.76386036960986%,
Top-2 Accuracy: 99.48665297741273%
Time:25.46
EPOCH [50/50]
Loss: 1.367992779425829e-05
Validation Loss: 0.24671741761443572,
Accuracy: 94.6611909650924%,
Top-2 Accuracy: 99.48665297741273%
Time:25.66
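As a sanity check on whether a loss around 0.2 can coexist with accuracy around 95%, here is a small plain-Python sketch of per-sample cross-entropy. The function `cross_entropy` and the example logits are hypothetical and only illustrate the formula; the one real number is the epoch 50 validation loss from the log above. Accuracy only looks at which class scores highest, while the loss also depends on how confident the prediction is, so the two do not have to move in lockstep.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


# A correct but not very confident prediction still contributes some loss.
correct_unsure = cross_entropy([1.0, 0.5, 0.0], 0)

# A confidently wrong prediction contributes a much larger loss.
wrong_confident = cross_entropy([6.0, 0.0, 0.0], 1)

# exp(-mean loss) is the geometric-mean probability assigned to the true class.
val_loss = 0.24671741761443572  # validation loss logged at epoch 50
geo_mean_prob = math.exp(-val_loss)

print(f"correct but unsure: {correct_unsure:.3f}")
print(f"wrong and confident: {wrong_confident:.3f}")
print(f"geometric-mean true-class probability: {geo_mean_prob:.3f}")
```

Under these assumptions, a mean loss near 0.25 corresponds to a geometric-mean true-class probability of about 0.78, and a few confidently wrong samples can account for much of that loss even when most predictions are correct.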
I hope you can help me with this. I just need clarification.