I have a model with three layers, each with 2048 nodes. Each input sample has 2048 features, and I want to train on 5×10^6 (5,000,000) samples, so the training data has shape 5,000,000×2048.
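For a rough sense of scale, here is a back-of-the-envelope sketch of the data and model size. It assumes float32 storage and fully connected layers with biases, neither of which is stated above, and it ignores any output layer. The function names are only illustrative.

```python
# Rough size estimate for the setup described above.
# ASSUMPTIONS (not stated in the question): float32 values,
# fully connected layers with biases, no separate output layer.
N_SAMPLES = 5_000_000
N_FEATURES = 2048
N_LAYERS = 3
UNITS = 2048
BYTES_PER_FLOAT32 = 4


def dataset_bytes(n_samples, n_features, bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to hold the full training matrix in memory."""
    return n_samples * n_features * bytes_per_value


def dense_param_count(n_inputs, units, n_layers):
    """Weights plus biases for a stack of equally sized dense layers."""
    total = 0
    fan_in = n_inputs
    for _ in range(n_layers):
        total += fan_in * units + units  # weight matrix + bias vector
        fan_in = units
    return total


data_gb = dataset_bytes(N_SAMPLES, N_FEATURES) / 1e9
params = dense_param_count(N_FEATURES, UNITS, N_LAYERS)
print(f"training data: {data_gb:.2f} GB")  # 40.96 GB under the float32 assumption
print(f"parameters:    {params:,}")        # 12,589,056 under the dense-layer assumption
```

If these assumptions hold, the training data alone is about 41 GB, which may affect how it has to be loaded during each epoch.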
I am using Google Colab Pro, but each epoch takes a long time. What can I do to shorten the training time? Has anyone compared Google Colab Pro and Google Colab Pro+ and seen a significant difference in training speed on data of this size?
Thank you.