I don't understand how to read the byte stream from the Azure TTS service in Python.
From the docs: https://learn.microsoft.com/en-us/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.audiodatastream?view=azure-python
bool = can_read_data(requested_bytes: int, pos: int)
and
int = read_data(audio_buffer: bytes, pos: int | None = None)
So I tried this:
import logging

import azure.cognitiveservices.speech as speechsdk
import pygame

speech_config = speechsdk.SpeechConfig(subscription='key', region='region')
# audio_config=None keeps the audio in memory instead of playing it on the default speaker
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

text = "Hello, world!"

# Synthesize the speech
result = speech_synthesizer.speak_text_async(text).get()

# Create an AudioDataStream from the synthesized result
stream = speechsdk.AudioDataStream(result)

# Initialize Pygame - we will use this to play the audio
pygame.mixer.init()

def play_audio(stream):
    chunk_size = 1024  # Size of each chunk to read
    audio_buffer = bytes()  # Bytes object to store audio data

    # Read and append audio data in chunks
    try:
        while stream.can_read_data(chunk_size):
            bytes_read = stream.read_data(chunk_size)  # Read a chunk of data as bytes
            audio_buffer += bytes_read  # Append the chunk to the bytes object
    except Exception as e:
        logging.error("[play_audio] Error while reading audio data: {}".format(str(e)))

    # Play the audio with Pygame
    try:
        pygame.mixer.Sound(audio_buffer).play()
        print(audio_buffer)
    except pygame.error:
        print("Error playing sound")

# Call the play_audio function with the audio stream
play_audio(stream)
NB: I could cheat and call stream.save_to_wav_file, but I want to stream the audio so that I can play it, pause it, and so on.
I can't figure this out. I feel the docs stop just short of showing a practical example of this.
But then maybe I'm not reading the docs right!
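
For what it's worth, the read_data signature quoted above takes a bytes buffer rather than a size, and it returns an int. One reading, which seems to match the pattern in Microsoft's Speech SDK samples, is that the caller pre-allocates the buffer, the SDK fills it in place, and the return value is the number of bytes written (0 when nothing is left). Below is a minimal sketch under that assumption. read_all_audio and synthesize_to_bytes are hypothetical helper names, not SDK functions, and key and region are placeholders. Whether the returned bytes begin with a WAV header depends on the configured output format, which this sketch does not check.

```python
def read_all_audio(stream, chunk_size=16000):
    """Drain an AudioDataStream-like object into a single bytes value.

    Assumption: stream.read_data(buffer) fills `buffer` in place and returns
    the number of bytes written, with 0 meaning there is no more data.
    """
    buffer = bytes(chunk_size)  # pre-allocated; the stream writes into it
    audio = bytearray()         # growable copy of everything read so far
    filled = stream.read_data(buffer)
    while filled > 0:
        audio.extend(buffer[:filled])  # copy only the bytes actually written
        filled = stream.read_data(buffer)
    return bytes(audio)


def synthesize_to_bytes(text, key, region):
    """Synthesize `text` and return the audio bytes (needs the SDK and network access)."""
    # Imported here so read_all_audio can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
    result = synthesizer.speak_text_async(text).get()
    return read_all_audio(speechsdk.AudioDataStream(result))
```

This collects the whole clip before returning. A streaming variant would use the same loop but hand each filled slice to the playback library as it arrives instead of appending it to one buffer.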