```python
for _ in range(int(args["num_train_epochs"])):
    for step, batch in enumerate(train_dataloader):
        model.train()
        inputs = {"input_ids": batch[0].to(args["device"]),
                  "attention_mask": batch[1].to(args["device"]),
                  "start_positions": batch[2].to(args["device"])}
                  # "p_mask": batch[4].to(args["device"])}
        outputs = model(**inputs)
        loss = outputs.loss

        if args["n_gpu"] > 1:
            loss = loss.mean()  # average over GPUs under DataParallel
        if args["gradient_accumulation_steps"] > 1:
            loss = loss / args["gradient_accumulation_steps"]

        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), args["max_grad_norm"])
        tr_loss += loss.item()

        if (step + 1) % args["gradient_accumulation_steps"] == 0:
            optimizer.step()
            scheduler.step()
            model.zero_grad()
            global_step += 1

            if args["local_rank"] in [-1, 0] and args["save_steps"] > 0 and global_step % args["save_steps"] == 0:
                output_dir = os.path.join(args["output_dir"], "checkpoint-{}".format(global_step))
                if not os.path.exists(output_dir):
                    os.makedirs(output_dir)
                model_to_save = model.module if hasattr(model, "module") else model
                model_to_save.save_pretrained(output_dir)
                # save the training arguments under a separate name so they don't overwrite the model weights
                torch.save(args, os.path.join(output_dir, "training_args.bin"))
                print("Saving model checkpoint to %s" % output_dir)

        if args["max_steps"] > 0 and global_step > args["max_steps"]:
            break
    if args["max_steps"] > 0 and global_step > args["max_steps"]:
        break
return global_step, (tr_loss / global_step) if global_step > 0 else tr_loss
```
I'm trying to fine-tune XLM-RoBERTa for context-based question answering. I customized my loss function, but there's something wrong with the loss computation. Please help! I set the learning rate to 3e-5.
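For reference, the loop assumes that `model`, `args`, `train_dataloader`, `optimizer`, and `scheduler` are created elsewhere. Below is a minimal sketch of what that setup might look like with the 3e-5 learning rate; the base checkpoint, AdamW, the linear schedule, and zero warmup steps are my assumptions, not taken from the original code:

```python
import torch
from transformers import XLMRobertaForQuestionAnswering, get_linear_schedule_with_warmup

# Assumed setup (not part of the original snippet): base checkpoint, AdamW, linear schedule.
model = XLMRobertaForQuestionAnswering.from_pretrained("xlm-roberta-base")
model.to(args["device"])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # learning rate from the question
t_total = (len(train_dataloader) // args["gradient_accumulation_steps"]) * int(args["num_train_epochs"])
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=t_total)  # warmup=0 is an assumption

tr_loss, global_step = 0.0, 0
```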