I have an infinite generator:
def get_random_distribution_from_generator(N=10):
    while True:
        k = random.randint(0, 1)
        if k == 0:
            cls = (1, 0)
            sample = np.random.uniform(-2, 8, LEN)
        elif k == 1:
            cls = (0, 1)
            sample = np.random.normal(3, 1, LEN)
        yield sample, cls
It draws a fixed number of random numbers from either a uniform or a normal distribution. It yields the sample and a tuple encoding the class (one-hot encoding).
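For reference, this is how drawing from the generator behaves (self-contained sketch; `LEN = 5` is an assumed value, in my code it is a module-level constant):

```python
import random
import numpy as np

LEN = 5  # assumed sample length

def get_random_distribution_from_generator(N=10):
    while True:
        k = random.randint(0, 1)
        if k == 0:
            cls = (1, 0)
            sample = np.random.uniform(-2, 8, LEN)
        elif k == 1:
            cls = (0, 1)
            sample = np.random.normal(3, 1, LEN)
        yield sample, cls

gen = get_random_distribution_from_generator()
sample, cls = next(gen)
print(sample.shape, cls)  # shape is always (LEN,); cls is (1, 0) or (0, 1)
```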
Now I want to create a Tensorflow dataset from this.
tfd = tf.data.Dataset.from_generator(
    get_random_distribution_from_generator,
    output_signature=(
        tf.TensorSpec(shape=(LEN,), dtype=tf.float32),
        tf.TensorSpec(shape=(2,), dtype=tf.int8),
    ),
)
tfd_batch_wise = tfd.shuffle(1000).batch(BATCH_SIZE).prefetch(1)
I can use this to train a neural network, but since the generator is infinite, training runs forever.
Is there a way to have a flexible but fixed number of values in the Tensorflow dataset?
I know:

- I could use take(N) and create a fixed-size Tensorflow dataset from the infinite version. But I do not want to keep it in memory. I chose the generator in order to be flexible concerning the size of the dataset.
- I can create a generator that yields a fixed number of elements. This is not very flexible: if I want to change the size, I have to change the code.
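To make the second option concrete, a pure-Python analogue of what I mean by capping the stream is itertools.islice, which truncates the infinite generator at N items (sketch with a simplified payload; `LEN = 5` is an assumed value):

```python
import itertools
import random
import numpy as np

LEN = 5  # assumed sample length

def get_random_distribution_from_generator():
    # same infinite generator as above
    while True:
        if random.randint(0, 1) == 0:
            yield np.random.uniform(-2, 8, LEN), (1, 0)
        else:
            yield np.random.normal(3, 1, LEN), (0, 1)

# islice caps the infinite stream at 100 items, analogous to take(100)
capped = itertools.islice(get_random_distribution_from_generator(), 100)
items = list(capped)
print(len(items))  # 100
```

But wrapping with islice still leaves me with a generator object rather than a callable, which brings me back to the problem below.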
I tried:

- creating a generator class. Here I get the error message that the generator must be callable, which an instance of the class is not.
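Roughly what my class-based attempt looked like (class name and payload are simplified placeholders):

```python
class RandomDistributionGenerator:
    """Iterator that yields a fixed number of elements."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.n:
            raise StopIteration
        self.count += 1
        return self.count  # placeholder payload

gen = RandomDistributionGenerator(5)
print(callable(gen))  # False -> from_generator rejects the instance
```

The instance is a perfectly good iterator, but since the class defines no __call__ method, from_generator refuses it.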
Is there a simpler solution that

- lets me specify the maximum number of elements the Tensorflow dataset will draw from an infinite generator, and
- keeps the generator callable, as required by the from_generator method?