I am trying to use YAMNet for transfer learning on speech command recognition. I plan to train the model on the mini speech commands dataset from the simple audio project and then deploy the trained model on an Android device.
I wrote the following code to build a new model that places a custom classifier on top of YAMNet:
import tensorflow as tf
import tensorflow_hub as hub
num_classes = 8
new_model = tf.keras.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/yamnet/1", output_key='embeddings', output_shape=[1024], trainable=False),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])
new_model.build([None, 16000])
new_model.summary()
However, running this code raises an error:
Call arguments received by layer 'keras_layer_34' (type KerasLayer):
• inputs=tf.Tensor(shape=(None, 16000), dtype=float32)
• training=None.
I suspect the error is related to how the input is processed, or to how the KerasLayer for YAMNet is configured. Can anyone help identify the cause and suggest a fix? Thank you in advance!
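For reference, here is a minimal sketch of an alternative structure. It assumes (from the YAMNet page on TF Hub) that the model accepts a single 1-D waveform of 16 kHz mono float32 samples rather than a batch, and that it returns scores, embeddings, and a log-mel spectrogram. If that holds, passing a [None, 16000] batch through the layer may be what fails. The helper names build_classifier_head, extract_embeddings, and load_yamnet are hypothetical, and the sketch has not been verified against the error above.

```python
import tensorflow as tf

NUM_CLASSES = 8        # from the code above: eight speech commands
EMBEDDING_DIM = 1024   # embedding size, matching output_shape=[1024] above


def build_classifier_head(num_classes=NUM_CLASSES):
    """Hypothetical helper: a small classifier over precomputed embeddings."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(EMBEDDING_DIM,), dtype=tf.float32),
        tf.keras.layers.Dense(num_classes, activation='softmax'),
    ])


def extract_embeddings(yamnet, waveform):
    """Hypothetical helper: run YAMNet on ONE 1-D waveform.

    Assumes `waveform` is float32, 16 kHz mono, with no batch dimension.
    The result is expected to have shape (num_frames, 1024).
    """
    _scores, embeddings, _spectrogram = yamnet(waveform)
    return embeddings


def load_yamnet():
    """Hypothetical helper: load YAMNet from TF Hub (needs network access)."""
    # Imported here so the classifier head can be built without tensorflow_hub.
    import tensorflow_hub as hub
    return hub.load("https://tfhub.dev/google/yamnet/1")
```

With these in place, one would call load_yamnet() once, pass each clip (for example a tf.zeros([16000]) tensor) through extract_embeddings, and train the head on the resulting per-frame embeddings, repeating each clip's label once per frame.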