Related Content

Tag Archive for openai-whisper, automatic-speech-recognition

Getting chunk level output with start and end timestamps with Whisper

I am using the Whisper3 model to transcribe several audio files. However, the output I get is a tensor. I would like to obtain text chunks with their corresponding start and end timestamps instead. Can someone please help me get this output using WhisperForConditionalGeneration only? I do get the desired output if I use pipeline with the AutoModelForSpeechSeq2Seq class instead of WhisperForConditionalGeneration, as shown below.
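One possible way to get chunk-level timestamps while still loading the model through WhisperForConditionalGeneration is to hand that model object to the Hugging Face pipeline, together with the tokenizer and feature extractor from its processor. The sketch below is a guess at a workable setup, not the asker's original code: the checkpoint name "openai/whisper-large-v3", the audio path, and the helper names transcribe_with_timestamps, format_chunks and _label are all assumptions made for illustration.

```python
from typing import Any, Optional


def transcribe_with_timestamps(
    audio_path: str, model_id: str = "openai/whisper-large-v3"
) -> dict[str, Any]:
    """Transcribe one audio file and return text plus timestamped chunks.

    Assumes `transformers` and `torch` are installed, and that ffmpeg is
    available so the pipeline can decode the file at `audio_path`.
    """
    # Imported here so that format_chunks() below works without transformers.
    from transformers import (
        WhisperForConditionalGeneration,
        WhisperProcessor,
        pipeline,
    )

    processor = WhisperProcessor.from_pretrained(model_id)
    model = WhisperForConditionalGeneration.from_pretrained(model_id)

    asr = pipeline(
        "automatic-speech-recognition",
        model=model,
        tokenizer=processor.tokenizer,
        feature_extractor=processor.feature_extractor,
        chunk_length_s=30,       # process long audio in 30-second windows
        return_timestamps=True,  # ask for segment-level (start, end) times
    )
    # The result is a dict with "text" and a "chunks" list; each chunk has
    # a "timestamp" tuple (start, end) in seconds and its "text".
    return asr(audio_path)


def _label(seconds: Optional[float]) -> str:
    # A timestamp can be None, e.g. when the audio ends mid-segment.
    return "?" if seconds is None else f"{seconds:.2f}"


def format_chunks(result: dict[str, Any]) -> list[str]:
    """Turn the pipeline result into one readable line per chunk."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{_label(start)} -> {_label(end)}] {chunk['text'].strip()}")
    return lines
```

With this in place, something like `for line in format_chunks(transcribe_with_timestamps("sample.wav")): print(line)` would print one timestamped line per chunk, where "sample.wav" stands in for a real file.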
