My inputs are 10-100 tokens and my outputs are ~1000 tokens, so I use separate embeddings for each to keep things computationally efficient. The obvious answer would be that graph execution simply can't handle inputs and outputs of different lengths, but I know that isn't the case, because at multiple points in this project the same code has worked fine. So maybe it's a version change or something?
Running in eager execution, everything works just fine; the dataset's element_spec is:
((TensorSpec(shape=(None, 20), dtype=tf.int32, name=None), TensorSpec(shape=(None, 100), dtype=tf.int32, name=None)), TensorSpec(shape=(None, 100), dtype=tf.int32, name=None))
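For context, here's a stand-in reconstruction of that eager path (the data here is random filler; only the shapes and dtypes match my real setup):

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: inputs padded to 20 tokens, outputs to 100.
problems = [np.random.randint(0, 50, size=20).astype(np.int32) for _ in range(8)]
solutions = [np.random.randint(0, 50, size=100).astype(np.int32) for _ in range(8)]

dataset = tf.data.Dataset.from_tensor_slices(
    (list(problems), list(solutions), list(solutions)))
# Regroup into ((input, decoder_input), target) and batch; printing
# element_spec then gives the nested spec shown above.
dataset = dataset.map(lambda p, s_in, s_out: ((p, s_in), s_out)).batch(4)
print(dataset.element_spec)
```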
Running in graph mode, the same code won't even create the dataset:
---> 39 dataset = tf.data.Dataset.from_tensor_slices((list(problems), list(solutions), list(solutions)))
ValueError: Expected values [array([27, ..., 16], dtype=int32)] to be a dense tensor with shape [20], but got shape [20, 20].
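One guess I haven't been able to confirm: maybe graph mode converts the Python list of per-example arrays differently than eager does. If so, stacking everything into contiguous ndarrays first might sidestep it (just a sketch, not a verified fix):

```python
import numpy as np
import tensorflow as tf

# Unverified guess: hand from_tensor_slices contiguous ndarrays instead of
# Python lists of per-example arrays, so shape inference sees (N, 20) and
# (N, 100) up front. problems/solutions as in the repro above.
problems_arr = np.stack(problems)    # shape (N, 20)
solutions_arr = np.stack(solutions)  # shape (N, 100)
dataset = tf.data.Dataset.from_tensor_slices(
    (problems_arr, solutions_arr, solutions_arr))
```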
I’ve created a minimal version in Colab:
https://colab.research.google.com/drive/1tfk518PwmrJEapxIQlOuiLmiTSfxREk3?usp=sharing
Caveat: it could be that eager execution skips a shape check it deems unnecessary, and that graph mode really does want same-sized inputs?
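If graph mode really does insist on uniform shapes at creation time, the fallback I'd try is a generator-based dataset with per-batch padding, so inputs and outputs keep their different lengths (again a sketch, not something I've verified against my model):

```python
import numpy as np
import tensorflow as tf

# Variable-length stand-in data: no padding up front.
problems = [np.random.randint(0, 50, size=n).astype(np.int32) for n in (12, 20, 17)]
solutions = [np.random.randint(0, 50, size=n).astype(np.int32) for n in (80, 100, 95)]

def pairs():
    for p, s in zip(problems, solutions):
        yield (p, s), s

dataset = tf.data.Dataset.from_generator(
    pairs,
    output_signature=(
        (tf.TensorSpec(shape=(None,), dtype=tf.int32),
         tf.TensorSpec(shape=(None,), dtype=tf.int32)),
        tf.TensorSpec(shape=(None,), dtype=tf.int32),
    ),
)
# padded_batch pads each component to the longest element in its batch,
# so inputs and targets can stay different lengths.
dataset = dataset.padded_batch(2)
```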