Is it possible to train GPT-2 on 1.5M datapoints in Colab, Jupyter, or Kaggle?
So far I have tried it in Colab, but it runs out of storage during the tokenization step, which is understandable. I also tried batching techniques. Later I tried running the same code on Kaggle, but it is currently showing an error while loading the transformer model; I'm still trying to get it to run. I just want to know whether this is feasible at all.
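For context, this is roughly how I'm tokenizing the data (a simplified sketch with a placeholder file name and column name, not my exact script):

```python
# Rough sketch of the tokenization step (placeholder file/column names).
from datasets import load_dataset
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Assumes a CSV with a "text" column; "my_data.csv" stands in for my real file.
dataset = load_dataset("csv", data_files="my_data.csv")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Batched map processes the data in chunks and caches results to disk
# (Arrow files) instead of keeping everything in RAM at once.
tokenized = dataset.map(
    tokenize,
    batched=True,
    batch_size=1000,
    remove_columns=dataset.column_names,
)
```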