This is my first time using NeMo ASR, so I don't know what's wrong.
```python
files = ["/content/harvard.wav"]  # assuming you have defined this list elsewhere
raw_text = ''
text = ''
for fname, transcription in zip(files, quartznet.transcribe(paths2audio_files=files)):
    raw_text = transcription
text = raw_text[0]
print(text)
```
```
TypeError: Output shape mismatch occured for audio_signal in module AudioToCharDataset :
Output shape expected = (batch, time) |
Output shape found : torch.Size([1, 293700, 2])
```
I've tried writing it in different ways, but the error persists:
```python
audio_file_path = "/content/harvard.wav"  # path to your audio file
transcript = quartznet.transcribe(paths2audio_files=audio_file_path)
print(transcript)
```
```python
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")
wave_file = ["/content/harvard.wav"]  # list containing the path to your audio file
transcript = quartznet.transcribe(paths2audio_files=wave_file)
print(transcript)
```
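One detail worth noting: the trailing dimension of 2 in `torch.Size([1, 293700, 2])` suggests the WAV file is stereo (two channels), while the dataset expects mono audio of shape `(batch, time)`. A minimal sketch of a workaround, assuming the fix is to average the channels down to mono before transcribing (the `to_mono` helper is hypothetical, not part of NeMo):

```python
import numpy as np

def to_mono(samples: np.ndarray) -> np.ndarray:
    """Collapse a (time, channels) stereo array to a (time,) mono array
    by averaging the channels; mono input is returned unchanged."""
    if samples.ndim == 2:
        return samples.mean(axis=1)
    return samples

# Stand-in array with the same shape the error reports for harvard.wav.
stereo = np.zeros((293700, 2), dtype=np.float32)
mono = to_mono(stereo)
print(mono.shape)  # (293700,)
```

In practice you would load the file (e.g. with `soundfile` or `librosa`, which can also downmix for you), apply a conversion like this, write the mono result back to a WAV, and pass that path to `transcribe`.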