I've been searching for an implementation of GDCN, an updated version of DCN, but it doesn't seem to be supported yet.
So I'm trying to tweak the Cross layer implementation by adding gate layers with a sigmoid activation:
# Low-rank gate projection: input_dim -> projection_dim (no bias, no activation).
self._gate_u = tf.keras.layers.Dense(
    self._projection_dim,
    kernel_initializer=_clone_initializer(self._kernel_initializer),
    kernel_regularizer=self._kernel_regularizer,
    use_bias=False,
    dtype=self.dtype,
)
# Gate output: projection_dim -> input_dim, squashed to (0, 1) by the sigmoid.
self._gate_v = tf.keras.layers.Dense(
    last_dim,
    kernel_initializer=_clone_initializer(self._kernel_initializer),
    bias_initializer=self._bias_initializer,
    kernel_regularizer=self._kernel_regularizer,
    bias_regularizer=self._bias_regularizer,
    use_bias=self._use_bias,
    dtype=self.dtype,
    activation="sigmoid",
)
....
def call(self, x0, x=None):
    ....
    # The gate output is added to the cross term here.
    return x0 * prod_output + self._gate_v(self._gate_u(x)) + x
But the loss doesn't converge for my use case. Is the implementation correct?
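One thing I'm unsure about is whether the gate should be added, as above, or multiplied into the cross term. To make the difference concrete, here is a minimal NumPy sketch of both variants. It assumes the gated cross layer has the form c_next = c0 * (W c + b) * sigmoid(W_g c) + c, which is my reading of GDCN and not confirmed against any library. The function names, the bias-free low-rank gate (`gate_u`, `gate_v`), and the random weights are all made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def additive_gate_cross(x0, x, w, b, gate_u, gate_v):
    """Variant from the post: the sigmoid gate output is ADDED to the cross term."""
    prod_output = x @ w + b
    gate = sigmoid((x @ gate_u) @ gate_v)
    return x0 * prod_output + gate + x


def multiplicative_gate_cross(x0, x, w, b, gate_u, gate_v):
    """Assumed GDCN-style variant: the gate SCALES the cross term element-wise."""
    prod_output = x @ w + b
    gate = sigmoid((x @ gate_u) @ gate_v)
    return x0 * prod_output * gate + x


# Hypothetical sizes and random weights, only to exercise the two functions.
rng = np.random.default_rng(0)
batch, dim, proj = 4, 8, 2
x0 = rng.normal(size=(batch, dim))
w = rng.normal(size=(dim, dim)) * 0.1
b = np.zeros(dim)
gate_u = rng.normal(size=(dim, proj)) * 0.1
gate_v = rng.normal(size=(proj, dim)) * 0.1

# First cross layer: x is x0 itself.
out_add = additive_gate_cross(x0, x0, w, b, gate_u, gate_v)
out_mul = multiplicative_gate_cross(x0, x0, w, b, gate_u, gate_v)
print(out_add.shape, out_mul.shape)
```

In the additive variant every feature is shifted by a value in (0, 1) at each layer regardless of the cross term, while in the multiplicative variant the gate only scales how much of the cross term passes through. Whether that difference explains the non-convergence in my case is a guess.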