TensorFlow/Keras Transformer struggles to predict the last position in a sequence but does well at all other positions
I am using a Transformer for next-frame prediction. Each frame has previously been encoded into a 1D latent vector by a VAE (the VAE's encode-decode reconstruction is very good).
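To make the setup concrete, here is a minimal sketch of how a sequence of latent vectors is typically turned into training pairs for next-frame prediction. It assumes one sequence of shape `(T, D)`, where `T` is the number of frames and `D` is the latent size. The helper name `make_next_frame_pairs` and the toy dimensions are hypothetical and only for illustration; my actual pipeline may differ in details.

```python
import numpy as np

def make_next_frame_pairs(latents):
    """Split a (T, D) latent sequence into shifted inputs and targets.

    The input at position i is frame i, and the target at position i
    is frame i + 1, so the model learns to predict the next frame.
    """
    latents = np.asarray(latents)
    if latents.ndim != 2 or latents.shape[0] < 2:
        raise ValueError("expected shape (T, D) with T >= 2")
    inputs = latents[:-1]   # frames 0 .. T-2
    targets = latents[1:]   # frames 1 .. T-1
    return inputs, targets

# Hypothetical toy data: 5 frames with a latent size of 3.
latents = np.arange(15, dtype=np.float32).reshape(5, 3)
inputs, targets = make_next_frame_pairs(latents)
```

With this shift, the last target position is the only one whose frame never appears in the input, so a model without a correct causal mask can look good everywhere except there.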