My current project is to add conversational AI to the NAO robot.
At the moment, I’m recording audio with NAOqi’s ALAudioRecorder, downloading the WAV file over SSH with paramiko, running speech-to-text on it, and sending the transcript to Gemini or Llama 3.
Is there a more efficient way of achieving the same result? The whole round trip currently takes a few seconds, and I’d like to optimise it so that the conversation feels more natural.
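To narrow down where the seconds go, each stage could be timed separately. Below is a minimal sketch of that idea, not my actual code: the four stage functions (`record_audio`, `download_wav`, `transcribe`, `ask_llm`) are hypothetical placeholders that just sleep, and would need to be replaced with the real ALAudioRecorder, paramiko, speech-to-text and LLM calls.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> elapsed seconds for the most recent turn


@contextmanager
def timed(stage):
    """Store the wall-clock duration of the wrapped block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Hypothetical placeholder stages: the sleeps stand in for the real work.
def record_audio():
    time.sleep(0.01)  # placeholder for recording on the robot
    return "/tmp/utterance.wav"  # assumed remote path, for illustration only


def download_wav(remote_path):
    time.sleep(0.01)  # placeholder for the SSH/SFTP download
    return "utterance.wav"


def transcribe(local_path):
    time.sleep(0.01)  # placeholder for speech-to-text
    return "hello"


def ask_llm(text):
    time.sleep(0.01)  # placeholder for the Gemini / Llama 3 request
    return "hi there"


def run_turn():
    """Run one conversation turn and time every stage individually."""
    with timed("record"):
        remote = record_audio()
    with timed("download"):
        local = download_wav(remote)
    with timed("speech_to_text"):
        text = transcribe(local)
    with timed("llm"):
        reply = ask_llm(text)
    return reply


if __name__ == "__main__":
    run_turn()
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.3f} s")
```

With the real calls plugged in, the per-stage numbers should show whether the recording, the file transfer, the transcription or the LLM request dominates the delay.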
Best regards,
Milo