Tag Archive for python, tensorflow, pytorch, gpt-2

I’m fine-tuning a small GPT model but I can’t get it to use my GPU

I’m currently fine-tuning a GPT-2 distilled model with approximately 130M parameters on 8 years’ worth of my WhatsApp chats. I’ve prepared a labeled dataset consisting of around 1 million sequences, each with 512 tokens. Below is the portion of code I am using for training:
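
Independent of the training code, a quick first check for the problem in the title is whether the framework can see a GPU at all. The following is a minimal sketch, assuming PyTorch (the post is tagged with both tensorflow and pytorch, so this choice is an assumption); `describe_device` is a hypothetical helper name, not something from the original post.

```python
import importlib.util


def describe_device() -> str:
    """Report which device PyTorch would use (assumes PyTorch is the framework)."""
    # Avoid a hard failure when torch is missing from the environment.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if torch.cuda.is_available():
        # Index 0 is the first CUDA device visible to this process.
        return "cuda: " + torch.cuda.get_device_name(0)
    return "cpu"


if __name__ == "__main__":
    print(describe_device())
```

If this prints `cpu`, training will stay on the CPU no matter what the training loop does, and the likely places to look are the installed PyTorch build and the GPU driver. If it prints a CUDA device, the next thing to verify is that both the model and every batch are actually moved to that device with `.to(device)`.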
