I am training a CycleGAN model with 40 images in each domain. Training is set to run for 50 epochs with batch_size=1, so bat_per_epo is 40 and the total number of iterations should be 2000. However, the model stops training after iteration 90, even though no condition in the code evaluates to true at iteration 90 that would stop it.
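For reference, the expected step count is just this arithmetic (a quick sanity-check sketch using my numbers, not part of the training code):

    # expected iteration counts for my run
    n_images = 40                        # images per domain
    n_batch = 1                          # batch size, fixed as in the paper
    n_epochs = 50
    bat_per_epo = n_images // n_batch    # 40 batches per epoch
    n_steps = bat_per_epo * n_epochs     # 2000 total iterations
    print(bat_per_epo, n_steps)          # prints: 40 2000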
def train(d_model_A, d_model_B, g_model_AtoB, g_model_BtoA, c_model_AtoB, c_model_BtoA, dataset, epochs=1):
    # define properties of the training run
    n_epochs, n_batch = epochs, 1  # batch size fixed to 1, as suggested in the paper
    # determine the output square shape of the discriminator
    n_patch = d_model_A.output_shape[1]
    # unpack dataset
    trainA, trainB = dataset
    # prepare image pools for fake images
    poolA, poolB = list(), list()
    # calculate the number of batches per training epoch
    bat_per_epo = int(len(trainA) / n_batch)
    # calculate the number of training iterations
    n_steps = bat_per_epo * n_epochs
    print(n_steps, bat_per_epo)
    # manually enumerate training steps
    for i in range(n_steps):
        # select a batch of real samples from each domain (A and B)
        X_realA, y_realA = generate_real_samples(trainA, n_batch, n_patch)
        X_realB, y_realB = generate_real_samples(trainB, n_batch, n_patch)
        # generate a batch of fake samples using both the B->A and A->B generators
        X_fakeA, y_fakeA = generate_fake_samples(g_model_BtoA, X_realB, n_patch)
        X_fakeB, y_fakeB = generate_fake_samples(g_model_AtoB, X_realA, n_patch)
        # update fake images in the pools; the paper suggests a buffer of 50 images
        X_fakeA = update_image_pool(poolA, X_fakeA)
        X_fakeB = update_image_pool(poolB, X_fakeB)
        # update generator B->A via the composite model; train_on_batch on a
        # multi-output model returns a list of losses, so keep only the total
        g_loss2, _, _, _, _ = c_model_BtoA.train_on_batch([X_realB, X_realA], [y_realA, X_realA, X_realB, X_realA])
        # update discriminator for A -> [real/fake]
        dA_loss1 = d_model_A.train_on_batch(X_realA, y_realA)
        dA_loss2 = d_model_A.train_on_batch(X_fakeA, y_fakeA)
        # update generator A->B via the composite model
        g_loss1, _, _, _, _ = c_model_AtoB.train_on_batch([X_realA, X_realB], [y_realB, X_realB, X_realA, X_realB])
        # update discriminator for B -> [real/fake]
        dB_loss1 = d_model_B.train_on_batch(X_realB, y_realB)
        dB_loss2 = d_model_B.train_on_batch(X_fakeB, y_fakeB)
        # summarize performance. With batch size 1, one epoch is one iteration
        # per image, e.g. 100 images means 100 iterations per epoch.
        print('Iteration>%d, dA[%.3f,%.3f] dB[%.3f,%.3f] g[%.3f,%.3f]' % (i+1, dA_loss1, dA_loss2, dB_loss1, dB_loss2, g_loss1, g_loss2))
        # evaluate the model performance periodically: once per epoch, e.g.
        # every 100th iteration if the dataset has 100 images (batch size 1)
        if (i+1) % (bat_per_epo * 1) == 0:
            # plot A->B translation
            summarize_performance(i, g_model_AtoB, trainA, 'AtoB')
            # plot B->A translation
            summarize_performance(i, g_model_BtoA, trainB, 'BtoA')
        if (i+1) % (bat_per_epo * 5) == 0:
            # save the models every 5 epochs, e.g. every 100 x 5 = 500
            # iterations if the dataset has 100 images (batch size 1)
            save_models(i, g_model_AtoB, g_model_BtoA)
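For context, update_image_pool keeps the 50-image buffer of fake images suggested in the paper. It is not shown above, so here is a sketch of the usual implementation, in case it matters for the question (the exact signature and max_size=50 are how my version works):

    from numpy import asarray
    from numpy.random import random, randint

    # rough sketch of the fake-image buffer (50 images, as in the paper)
    def update_image_pool(pool, images, max_size=50):
        selected = list()
        for image in images:
            if len(pool) < max_size:
                # pool not full yet: store the image and use it directly
                pool.append(image)
                selected.append(image)
            elif random() < 0.5:
                # half the time, use the fresh image without pooling it
                selected.append(image)
            else:
                # otherwise replace a random pooled image and reuse the old one
                ix = randint(0, len(pool))
                selected.append(pool[ix])
                pool[ix] = image
        return asarray(selected)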
This is the train function. epochs is passed as 50, and the last condition:

    if (i+1) % (bat_per_epo * 5) == 0:
        # save the models every 5 epochs, e.g. every 100 x 5 = 500
        # iterations if the dataset has 100 images (batch size 1)
        save_models(i, g_model_AtoB, g_model_BtoA)
never gets executed.
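To rule the condition itself out: with bat_per_epo = 40, the save should fire at iterations 200, 400, ..., 2000, so at iteration 90 it simply hasn't fired yet (a quick check with my numbers):

    bat_per_epo = 40                     # 40 images, batch size 1
    save_every = bat_per_epo * 5         # save every 200 iterations
    saves = [i + 1 for i in range(2000) if (i + 1) % save_every == 0]
    print(saves[:3])                     # [200, 400, 600]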
The model seems to be training and improving: the generator loss is decreasing, and the resulting images aren't bad given that it had only trained for 40 and 80 iterations. Images were generated and the models were saved after the 40th and 80th iterations, and then training stopped.
Here are the results I got:
[generated image after 40 iterations]
[generated image after 80 iterations]