Does anyone know why this Dot layer, which does a matrix multiply on inputs of shape [3,5] and [5,6], can be serialized but not deserialized?
import numpy as np
import tensorflow as tf

M = 3
K = 5
N = 6
input0 = tf.keras.layers.Input(shape=[M, K])
constant = tf.constant(np.random.random([1, K, N]))
input1 = constant  # tf.keras.layers.Lambda(lambda x: constant)(input0)
x = tf.keras.layers.Dot(axes=[-1, -2], trainable=False)([input0, input1])
m = tf.keras.Model(inputs=[input0], outputs=[x])
print(m(np.random.random([1, M, K])))  # runs fine before saving
tf.keras.models.save_model(m, "test_dot.keras")  # saving succeeds
tf.keras.models.load_model("test_dot.keras")  # loading fails
The immediate error is:
if shape1[axes[0]] != shape2[axes[1]]:
IndexError: list index out of range
shape2 has an unexpected value. Instead of a shape, it is a nested list of empty tuples with dimensions [1,5,6]:
[[[(), (), (), (), (), ()], [(), (), (), (), (), ()], [(), (), (), (), (), ()], [(), (), (), (), (), ()], [(), (), (), (), (), ()]]]
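To make the failure concrete, here is a minimal sketch in plain Python. It assumes the shape check indexes shape2 with the raw axes[1] = -2 from the Dot(axes=[-1,-2]) call, as the traceback line suggests. The name `expected_shape2` is my guess at what a proper shape would look like; it is not taken from the Keras source.

```python
# shape2 as observed in the debugger: a 1 x 5 x 6 nested list of empty tuples.
shape2 = [[[() for _ in range(6)] for _ in range(5)]]
# Assumption: the shape the second operand should have had, (batch, K, N).
expected_shape2 = (1, 5, 6)
axes = [-1, -2]

# A proper shape has three entries, so index -2 picks the contracted dimension K.
contracted_dim = expected_shape2[axes[1]]

# The nested list has a single outer element, so index -2 is out of range.
try:
    shape2[axes[1]]
    error = None
except IndexError as exc:
    error = exc

print(len(shape2), contracted_dim, repr(error))
```

So the IndexError would come from the outer list having length 1, not from the operand dimensions being incompatible.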
The root cause seems to lie between these two places:
…/keras/src/engine/base_layer.py(1063)__call__()
-> return self._functional_construction_call(
Here inputs[0] is <KerasTensor: shape=(None, 3, 5)>, as expected, and
inputs[1] is a 3D nested list of floats with dimensions [1,5,6].
Then, one call deeper, at …/keras/src/engine/base_layer.py(2603)_functional_construction_call(),
inputs[1] has changed to a 3D nested list of tf.Tensor objects, each a scalar!
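The change described above is what I would expect if the nested list is walked as a structure, with every float treated as its own leaf, rather than being converted into a single tensor. The sketch below imitates that in plain Python. `map_leaves` and `leaf_shape` are hypothetical helpers for illustration, not Keras functions, and I have not confirmed that this is the exact code path Keras takes.

```python
def map_leaves(fn, structure):
    """Apply fn to every non-list leaf, preserving the list nesting."""
    if isinstance(structure, list):
        return [map_leaves(fn, item) for item in structure]
    return fn(structure)


def leaf_shape(value):
    """A lone float is a scalar, so its shape is the empty tuple."""
    return ()


# Stand-in for the deserialized constant: a 1 x 5 x 6 nested list of floats.
constant_as_list = [[[0.5 for _ in range(6)] for _ in range(5)]]

# Taking the "shape" leaf by leaf yields a nested list of () instead of (1, 5, 6).
shapes = map_leaves(leaf_shape, constant_as_list)
print(shapes)
```

The result has the same structure as the shape2 value shown earlier, which is at least consistent with the constant being handled element by element after loading.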
Another symptom: when you load the .keras file into Netron and try to view the second operand, it says “ERROR: Invalid tensor data length.” If the second operand isn’t constant, there is no issue. This suggests to me that Dot doesn’t support constant operands. Is that right? I know I can convert a Dot on 2D data into a Conv1D, like this, but I would prefer to use Dot everywhere.
Dot(axes=[-1,-2])([A,B]) = A B = Conv1D(weights=Transpose(B))(A)
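As a sanity check on that identity, here is a small NumPy sketch (no Keras involved). `conv1d_valid` is a hypothetical helper I wrote for illustration. It assumes channels-last data and a kernel laid out as (kernel_size, in_channels, filters), which is the layout Keras Conv1D uses.

```python
import numpy as np

M, K, N = 3, 5, 6
rng = np.random.default_rng(0)
A = rng.random((1, M, K))  # activations: (batch, steps, channels)
B = rng.random((1, K, N))  # constant operand, same shape as in the model


def conv1d_valid(x, kernel):
    """Channels-last 1D convolution, stride 1, no padding.

    x: (batch, steps, in_channels); kernel: (kernel_size, in_channels, filters).
    """
    ks = kernel.shape[0]
    out_steps = x.shape[1] - ks + 1
    out = np.zeros((x.shape[0], out_steps, kernel.shape[2]))
    for t in range(out_steps):
        window = x[:, t:t + ks, :]  # (batch, ks, in_channels)
        out[:, t, :] = np.einsum("bsk,skn->bn", window, kernel)
    return out


# Contract A's last axis with B's second-to-last axis, like axes=[-1, -2].
dot_result = np.matmul(A, B)
# B already has the (1, K, N) kernel layout, so it is used as the kernel directly.
conv_result = conv1d_valid(A, B)

print(np.allclose(dot_result, conv_result))
```

With this kernel layout, B with shape [1,K,N] can serve as a kernel of size 1 without a transpose. The transpose would only be needed in a framework that stores kernels as (filters, in_channels, kernel_size).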
I’m using TensorFlow 2.13.