I’m using TensorFlow 2.16.1. Ultimately, for each training sample, I need to create hidden and cell tensors for use with an LSTM. Different training samples require different-sized hidden/cell tensors.
My thought process was to subclass PyDataset and generate batches of integers. Each integer in a batch represents the size of the LSTM tensors.
I’m getting stuck on how to generate these tensors when I have to deal with batches. If I know the batch size, I guess maybe I can do this:
class StateLayer(tf.keras.layers.Layer):
    def __init__(self, batch_size=3):
        super().__init__()
        self.batch_size = batch_size
        # One shared initial-state row, scaled by 1/sqrt(state_size).
        denom = tf.sqrt(tf.cast(4, tf.float32))
        self.state_init = tf.math.divide(tf.random.normal(shape=[1, 4]), denom)

    def call(self, inputs):
        # inputs: 1-D int tensor, one state size per sample in the batch
        states = [tf.tile(self.state_init, [inputs[i], 1])
                  for i in range(self.batch_size)]
        return tf.ragged.stack(states)
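For concreteness, here’s how I’m exercising the layer in eager mode (the class is repeated so the snippet is self-contained, and the sizes are made up):

```python
import tensorflow as tf

class StateLayer(tf.keras.layers.Layer):
    def __init__(self, batch_size=3):
        super().__init__()
        self.batch_size = batch_size
        # One shared initial-state row, scaled by 1/sqrt(state_size).
        denom = tf.sqrt(tf.cast(4, tf.float32))
        self.state_init = tf.math.divide(tf.random.normal(shape=[1, 4]), denom)

    def call(self, inputs):
        # inputs: 1-D int tensor, one state size per sample in the batch
        states = [tf.tile(self.state_init, [inputs[i], 1])
                  for i in range(self.batch_size)]
        return tf.ragged.stack(states)

layer = StateLayer(batch_size=3)
out = layer(tf.constant([2, 5, 3]))  # one size per sample
# out is a RaggedTensor with 3 rows of lengths 2, 5, 3
```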
Is this a reasonable approach? I keep thinking there must be a way to do this without hard-coding batch_size, but I can’t figure it out.
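One batch-size-free idea I’ve been sketching (the class and parameter names here are my own, and I’m not sure it’s idiomatic): instead of looping over a fixed batch_size in Python, tile state_init once for the total number of rows and let tf.RaggedTensor.from_row_lengths carve it into one variable-length chunk per sample.

```python
import tensorflow as tf

class RaggedStateLayer(tf.keras.layers.Layer):
    def __init__(self, state_size=4):
        super().__init__()
        denom = tf.sqrt(tf.cast(state_size, tf.float32))
        self.state_init = tf.random.normal(shape=[1, state_size]) / denom

    def call(self, sizes):
        # sizes: 1-D int tensor, one state size per sample; no batch_size needed
        total = tf.reduce_sum(sizes)
        flat = tf.tile(self.state_init, [total, 1])  # [sum(sizes), state_size]
        # Split the flat rows into one ragged row per sample.
        return tf.RaggedTensor.from_row_lengths(flat, row_lengths=sizes)

layer = RaggedStateLayer()
states = layer(tf.constant([2, 5, 3]))
# states is a RaggedTensor of shape [3, None, 4] with row lengths 2, 5, 3
```

Since the batch size is read from the sizes tensor itself, the same layer should work for any batch, though I haven’t convinced myself this is the intended pattern.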