I’m working on a CNN project and need some help with adjusting the steps and batch parameters. Here’s a summary of my dataset:
Training images: 5360
Validation images: 1151
Test images: 1147
I have set the following parameters:
Epochs: 20
Batch size: 16
To calculate the steps per epoch, I’m using this code:
nb_train_steps = np.ceil(train_data_gen.samples / batch_size).astype(int)
nb_validation_steps = np.ceil(valid_data_gen.samples / batch_size).astype(int)
nb_test_steps = np.ceil(test_data_gen.samples / batch_size).astype(int)
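For reference, the step counts this formula produces for the dataset sizes above can be checked with plain Python. This is a minimal sketch that uses only the numbers from this post; the `sizes` dictionary and the `steps_for` helper are illustrative names, not part of the actual script.

```python
import math

# Dataset sizes and batch size as stated in this post.
batch_size = 16
sizes = {"train": 5360, "validation": 1151, "test": 1147}


def steps_for(n_samples, batch_size):
    """Number of batches needed to cover every sample once (rounds up)."""
    return math.ceil(n_samples / batch_size)


steps = {name: steps_for(n, batch_size) for name, n in sizes.items()}
for name, n_steps in steps.items():
    print(f"{name}: {n_steps} steps")
# train: 335 steps, validation: 72 steps, test: 72 steps
```

Only the training set divides evenly (5360 / 16 = 335); the other two are rounded up.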
However, I’m encountering the following error during training:
Epoch 2/20
2024-08-19 01:58:11.161693: I tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
	 [[{{node IteratorGetNext}}]]
C:\Users\anuja\AppData\Local\Programs\Python\Python312\Lib\contextlib.py:158: UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches. You may need to use the `.repeat()` function when building your dataset.
  self.gen.throw(value)
2024-08-19 01:58:11.190224: I tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
	 [[{{node IteratorGetNext}}]]
Traceback (most recent call last):
  File "c:\Users\anuja\Desktop\Project\Scripts\model.py", line 140, in <module>
    custom_model.fit(
  File "C:\Users\anuja\AppData\Local\Programs\Python\Python312\Lib\site-packages\keras\src\utils\traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\anuja\AppData\Local\Programs\Python\Python312\Lib\site-packages\keras\src\backend\tensorflow\trainer.py", line 354, in fit
    "val_" + name: val for name, val in val_logs.items()
                                        ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'items'
The issue seems to be that the validation and test datasets have a number of images that isn’t perfectly divisible by the batch size. For instance:
- For the validation set: 1151 / 16 = 71.9375. When rounded up to 72, the program expects an additional image that doesn’t exist in the directory.
- I anticipate a similar issue with the test set.
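To make the arithmetic for the validation set concrete, here is a small sketch of how a simple batching iterator would split 1151 images into batches of 16. It assumes the loader behaves like a plain single-pass iterator that emits a shorter final batch, which may not match what the Keras generators in my script actually do; `batches` is a hypothetical helper written only for this illustration.

```python
def batches(n_samples, batch_size):
    """Yield lists of sample indices, one list per batch, for a single pass."""
    for start in range(0, n_samples, batch_size):
        yield list(range(start, min(start + batch_size, n_samples)))


# One full pass over a 1151-image validation set with batch size 16.
one_pass = list(batches(1151, 16))
batch_sizes = [len(b) for b in one_pass]
print(len(one_pass), batch_sizes[-1])  # 72 batches, the last one holds 15 images

# A single-pass iterator has nothing left once those 72 batches are consumed.
it = batches(1151, 16)
for _ in range(72):
    next(it)
exhausted = next(it, None) is None
print(exhausted)  # True
```

Under that assumption, 72 steps cover every image exactly once with a final batch of 15, and any request beyond the 72nd batch finds the iterator empty.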
I thought about deleting images to make the total number easily divisible by 16 (e.g., reducing 1151 to 1136 so that 1136 / 16 = 71), but this feels wasteful. I’m aware of the .repeat() function, which might help, but I’m unsure how it will affect the model, particularly in terms of overfitting.
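To reason about what repeating the data would mean, here is a toy model in plain Python, not TensorFlow code. It assumes that `.repeat()` simply restarts the pass over the batches when it ends, and it ignores shuffling and augmentation; `itertools.cycle` stands in for `.repeat()`, and `batches` is a hypothetical helper. The sketch counts how often each validation image would be served over 20 epochs of 72 steps.

```python
from collections import Counter
from itertools import cycle, islice


def batches(n_samples, batch_size):
    """Return one pass over the data as a list of index batches."""
    return [
        list(range(start, min(start + batch_size, n_samples)))
        for start in range(0, n_samples, batch_size)
    ]


n_samples, batch_size, epochs = 1151, 16, 20
one_pass = batches(n_samples, batch_size)
steps_per_epoch = len(one_pass)  # rounds up, so the short final batch is included

# cycle() plays the role of .repeat(): start over when the pass ends.
stream = cycle(one_pass)
seen = Counter()
for _ in range(epochs):
    # islice() pulls the next steps_per_epoch batches from the shared stream.
    for batch in islice(stream, steps_per_epoch):
        seen.update(batch)

print(len(seen), set(seen.values()))  # every image is counted once per epoch
```

In this toy model, with the step count equal to one full pass, repeating does not serve any image more often within an epoch; each image is seen once per epoch, the same as without repetition. Whether the real pipeline behaves the same way is exactly what I am unsure about.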
My Questions:
- How should I handle this situation without deleting any data?
- How does using .repeat() impact the model, especially regarding overfitting?
- Are there better approaches to dealing with this issue?
I'm using:
TensorFlow version: 2.17.0
Keras version: 3.4.1