I am training an LSTM model and validating it with leave-one-out cross-validation. I have 11520 samples, so I have to train a new model 11520 times. I loop over each split produced by scikit-learn's `LeaveOneOut`; inside the loop I initialize a new model, train it, predict on the held-out test sample, and then call `keras.backend.clear_session()` to clear the old model, `tf.compat.v1.reset_default_graph()` to reset the graph, and `gc.collect()` to collect garbage. Initially each model trains in roughly 6-7 seconds, but after about 600 models the training time grows to 25-50 seconds. Here is my code:
```python
from tensorflow import keras

def get_model(channels):
    # note: `channels` is currently unused; the LSTM infers its input shape on first fit
    model2 = keras.models.Sequential()
    model2.add(keras.layers.LSTM(64, return_sequences=False))
    model2.add(keras.layers.Dense(1, activation='sigmoid'))
    model2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model2
```
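(As an aside, `channels` is unused in `get_model`. If I pinned the input shape up front, it would look like the sketch below; this is only a variant for clarity, not the code that produced the timings, and my loop currently passes `X_train_scaled.shape[1]`, the timestep axis, so this variant would need `shape[2]` instead.)

```python
def get_model_fixed(channels):
    # variant with an explicit input shape: (timesteps, channels)
    model2 = keras.models.Sequential()
    model2.add(keras.layers.LSTM(64, input_shape=(1, channels), return_sequences=False))
    model2.add(keras.layers.Dense(1, activation='sigmoid'))
    model2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model2
```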
```python
import gc
from time import time

import numpy as np
import tensorflow as tf
from sklearn.model_selection import LeaveOneOut
from sklearn.preprocessing import StandardScaler
from sklearn.utils import shuffle

def leaveOneOutCVLSTM(X, y, epochs, batch_size, validation_split):
    X_shuffle, y_shuffle = shuffle(X, y, random_state=42)
    cv = LeaveOneOut()
    # enumerate splits
    y_true, y_pred = list(), list()
    i = 1
    for train_ix, test_ix in cv.split(X_shuffle):
        # split data
        X_train, X_test = X_shuffle[train_ix, :], X_shuffle[test_ix, :]
        # scale each feature channel independently (fit on train, apply to test)
        scalers = {}
        X_train_scaled = np.empty_like(X_train, dtype=float)
        X_test_scaled = np.empty_like(X_test, dtype=float)
        for j in range(X_train.shape[2]):
            scalers[j] = StandardScaler()
            X_train_scaled[:, :, j] = scalers[j].fit_transform(X_train[:, :, j])
        for j in range(X_test.shape[2]):
            X_test_scaled[:, :, j] = scalers[j].transform(X_test[:, :, j])
        y_train, y_test = y_shuffle[train_ix], y_shuffle[test_ix]
        # fit model
        new_model = get_model(X_train_scaled.shape[1])
        st = time()
        if i <= 5:
            new_model.fit(X_train_scaled, y_train, epochs=epochs, batch_size=batch_size, validation_split=validation_split)
        else:
            new_model.fit(X_train_scaled, y_train, epochs=epochs, batch_size=batch_size, validation_split=validation_split, verbose=False)
        # evaluate model
        y_hat = new_model.predict(X_test_scaled, verbose=False)
        ed = time()
        dr = ed - st
        print(i, " ", dr)
        # store
        y_true.append(y_test[0])
        y_pred.append(y_hat[0])
        i += 1
        # attempt to free the old model and graph between folds
        keras.backend.clear_session()
        tf.compat.v1.reset_default_graph()
        gc.collect()
    return y_true, y_pred
```
```python
X_batch = X.reshape(11520, 1, 156)
y_true, y_pred = leaveOneOutCVLSTM(X_batch, Y, 10, 32, 0.2)
```
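(For context, `X` holds 11520 samples of 156 features each, and the reshape just adds a single-timestep axis; a quick check with a random stand-in for my real data:)

```python
import numpy as np

X_demo = np.random.rand(11520, 156)        # stand-in for my real feature matrix
X_batch = X_demo.reshape(11520, 1, 156)    # (samples, timesteps, features)
print(X_batch.shape)                       # -> (11520, 1, 156)
```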
Here are the training times printed for the 1st through 16th iterations:
```
1 8.427346229553223
2 7.397351503372192
3 7.472941875457764
4 7.418887615203857
5 7.5288026332855225
6 6.432919502258301
7 6.417744398117065
8 6.312522649765015
9 6.350329160690308
10 6.340737342834473
11 6.3199241161346436
12 6.310317039489746
13 6.3174097537994385
14 6.346491813659668
15 6.2766053676605225
16 6.296995401382446
```
and for the 600th through 616th iterations:
```
600 26.77048420906067
601 20.864712238311768
602 20.118656873703003
603 23.869750022888184
604 23.6923668384552
605 26.10648512840271
606 23.909359216690063
607 36.399033069610596
608 22.179851055145264
609 16.407938718795776
610 30.585895776748657
611 23.5596022605896
612 25.86080241203308
613 44.86601257324219
614 23.27703547477722
615 24.88290023803711
616 19.156887531280518
```
I have already used `keras.backend.clear_session()`, `tf.compat.v1.reset_default_graph()`, and `gc.collect()` to clear the overhead, but the training time still increases. The slowdown matters: even at 6-7 seconds per fold the full run takes roughly 11520 × 7 s ≈ 22 hours, and at 25-50 seconds per fold it becomes infeasible. So I want to know: 1) Why does the training time increase after about 600 iterations of the loop? 2) What should I do so that the training time stays at 6-7 seconds?
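Would something like the following keep the per-fold time constant, or does it have its own problems? This is just a rough, untested sketch of the idea, not code I have run; one caveat I can see is that every fold would start from the identical random initialization rather than a fresh one.

```python
# Untested sketch: build one model outside the loop, snapshot its freshly
# initialized weights, and restore them every fold so no new model/graph
# objects are created inside the loop.
model = get_model(156)                     # reusing my builder from above
model.build(input_shape=(None, 1, 156))    # create the weights up front
initial_weights = model.get_weights()      # snapshot of the random init

for train_ix, test_ix in cv.split(X_shuffle):
    model.set_weights(initial_weights)     # every fold restarts from the SAME init
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])    # re-compile to reset the Adam state
    # ... scale, fit, and predict exactly as in my loop above ...
```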