I am trying to run faster-whisper in Azure Machine Learning Studio.
The code is the sample code from https://github.com/SYSTRAN/faster-whisper and runs fine on my laptop (on CPU instead of GPU).
The Notebook in ML Studio is connected to a Tesla T4 GPU.
torch.cuda.get_device_name(0)
returns 'Tesla T4'.
Below is the code I ran:
from faster_whisper import WhisperModel
folder="Marko"
model_size = "tiny"
# Run on GPU with INT8 weights and FP16 computation
model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")

segments, info = model.transcribe(f"{folder}/{folder}.wav", beam_size=5)

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
The model gets downloaded, CPU usage hits 100%, and then the instance crashes.
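One thing I have been considering is whether the requested compute type is even supported by the CTranslate2 build on the instance, since faster-whisper delegates the GPU work to CTranslate2. A minimal sketch of a fallback loop I could try, where the helper name, the fallback order, and the caught exception types are my own assumptions (which exception an unsupported compute type raises depends on the CTranslate2 version):

```python
# Hypothetical helper: try progressively safer compute types until one loads.
# The order and the exception types caught are assumptions, not documented behavior.
def load_with_fallback(loader, device="cuda",
                       compute_types=("int8_float16", "float16", "int8", "float32")):
    """Try each compute type in turn; return (model, compute_type) for the first that loads."""
    last_err = None
    for ct in compute_types:
        try:
            return loader(device=device, compute_type=ct), ct
        except (ValueError, RuntimeError) as err:
            last_err = err
    raise RuntimeError("no compute type could be loaded") from last_err
```

which I would then call with something like `model, ct = load_with_fallback(lambda **kw: WhisperModel(model_size, **kw))` to at least learn which compute types fail on the T4.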
Just for fun, I tried plain OpenAI Whisper instead:
import whisper
folder="Marko"
model_size = "medium"
model = whisper.load_model(model_size)
result = model.transcribe(f"{folder}/{folder}.wav")
print(result["text"])
This works without problems.
Can anyone point me in the right direction?