I recently had to switch to WSL 2 in order to enable GPU computing with TensorFlow. I had a piece of code working on Windows (CPU) that tunes three hyperparameters using keras-tuner. After switching to WSL 2, however, the tuning suddenly stops after a varying number of trials.
Sometimes it gets stuck after 3 trials, sometimes after 2. It doesn't crash; it just stays at epoch 1/3, so there is no error message. This is the tuning code (which worked on Windows):
tuner = kt.Hyperband(build_tune_model,
                     objective='val_accuracy',
                     max_epochs=25,
                     factor=3,
                     directory='keras_tuner',
                     project_name='hyperband_tune')

early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
tensorboard_callback = tf.keras.callbacks.TensorBoard("/tmp/tb_logs")

tuner.search(
    ds_train,
    epochs=400,
    validation_data=ds_valid,
    steps_per_epoch=train_size // BATCH_SIZE,
    validation_steps=valid_size,
    callbacks=[early_stopping, tensorboard_callback]
)
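One thing that can be checked in the call above is the step arithmetic. Keras counts both steps_per_epoch and validation_steps in batches, not in examples. The sketch below assumes that ds_train and ds_valid are batched with BATCH_SIZE, which the post does not state, and uses made-up dataset sizes. Under that assumption, validation_steps=valid_size asks for BATCH_SIZE times more validation batches than one pass over ds_valid provides. Whether this has anything to do with the stall is not established here.

```python
import math


def batches_per_pass(num_examples: int, batch_size: int, drop_remainder: bool = False) -> int:
    """Number of batches that one full pass over a batched dataset yields."""
    if drop_remainder:
        return num_examples // batch_size
    return math.ceil(num_examples / batch_size)


def passes_needed(steps: int, num_examples: int, batch_size: int) -> float:
    """How many full passes over the data a given number of steps corresponds to."""
    return steps / batches_per_pass(num_examples, batch_size)


# Hypothetical sizes for illustration only; the real values are not given in the post.
train_size, valid_size, BATCH_SIZE = 8000, 1600, 32

steps_per_epoch = train_size // BATCH_SIZE  # same expression as in tuner.search(...)
validation_steps = valid_size               # same expression as in tuner.search(...)

train_passes = passes_needed(steps_per_epoch, train_size, BATCH_SIZE)
valid_passes = passes_needed(validation_steps, valid_size, BATCH_SIZE)

print(f"training:   {train_passes} pass(es) per epoch")
print(f"validation: {valid_passes} pass(es) per epoch")
```

If the validation figure comes out far above 1 for the real sizes, comparing against validation_steps=valid_size // BATCH_SIZE would be a cheap experiment.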
Things I have tried:
- Waiting for an hour (previous trials took less than 4 min to complete)
- Using different tuners (Bayesian optimization and Hyperband)
- Checking the PC's performance stats. The CPU is in use (53%), while the GPU is completely idle (which makes sense, as it's not starting the epoch). CPU usage drops when I interrupt the kernel.
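A busy CPU with an idle GPU could mean the stall happens before the first training step, for example in the input pipeline. A minimal sketch for testing that in isolation is to pull a few batches directly and time them. The helper name time_batches is hypothetical, and the list passed in at the end is stand-in data so the snippet runs on its own; in the notebook, ds_train or ds_valid would be passed instead. If the pipeline itself hangs, this call will hang too, which is itself informative.

```python
import time
from typing import Iterable, List


def time_batches(dataset: Iterable, num_batches: int = 5) -> List[float]:
    """Pull up to num_batches items from dataset and return the seconds each one took."""
    timings = []
    iterator = iter(dataset)
    for _ in range(num_batches):
        start = time.perf_counter()
        try:
            next(iterator)
        except StopIteration:
            break  # dataset was exhausted before num_batches items were read
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in data; replace with ds_train or ds_valid in the real notebook.
demo_timings = time_batches([[1, 2], [3, 4], [5, 6]], num_batches=5)
print(f"read {len(demo_timings)} batches, slowest took {max(demo_timings):.6f} s")
```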
Extra info:
- I use Python 3.11.3.
- I installed tensorflow with
      python3.11 -m pip install tensorflow[and-cuda]
  which gave no error messages. This installed tensorflow 2.17.0. Before installing, I had removed any existing CUDA installations on my PC, just to be sure.
- I installed WSL 2 using
      wsl --install
  I use the default Ubuntu distro.
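To confirm what this installation actually provides inside WSL 2, a short report of the TensorFlow version, CUDA build flag, and visible GPUs can help. This is only a sketch: the function name tensorflow_report is hypothetical, and the import is guarded so the snippet still runs where TensorFlow is missing.

```python
import platform


def tensorflow_report() -> dict:
    """Collect the Python version and, if TensorFlow imports, its version and visible GPUs."""
    report = {
        "python": platform.python_version(),
        "tensorflow": None,
        "built_with_cuda": None,
        "gpus": [],
    }
    try:
        import tensorflow as tf
    except ImportError:
        return report  # TensorFlow is not installed in this environment

    report["tensorflow"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    # An empty list means TensorFlow sees no GPU and would fall back to the CPU.
    report["gpus"] = [device.name for device in tf.config.list_physical_devices("GPU")]
    return report


print(tensorflow_report())
```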
Worth noting:
I get the following warnings when loading TensorFlow that I did not get before switching over to WSL 2:
Could one of these warnings be the issue? Also, I use Jupyter Notebook; I will try running my code from the terminal to see if that works.
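For the terminal run, one way to find out where the process is stuck, given that no error is raised, is Python's standard faulthandler module, which can print the stack of every thread after a timeout. The sketch below is meant for a plain script; the helper names and the 600-second timeout are arbitrary choices, and inside Jupyter sys.stderr may not be a real file, in which case faulthandler cannot write to it.

```python
import faulthandler
import sys


def enable_hang_dump(timeout_s: float = 600.0, file=sys.stderr) -> None:
    """Dump all thread tracebacks to file every timeout_s seconds until cancelled."""
    faulthandler.enable(file=file)
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)


def disable_hang_dump() -> None:
    """Cancel the periodic traceback dump."""
    faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    enable_hang_dump(timeout_s=600.0)
    try:
        pass  # placeholder: build the tuner and call tuner.search(...) here
    finally:
        disable_hang_dump()
```

The timeout only needs to be longer than a normal trial, since previous trials finished in under 4 minutes.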
I would greatly appreciate any help. If any info is missing, please let me know. Thank you in advance.