I have a call recording of a conversation between an agent and a customer. I need an accurate, diarized transcript that attributes each sentence to the person who spoke it, so that the agent’s words and the customer’s words are distinguished with 100% accuracy.
Could you please assist with this?
I have tried using Whisper models on call recordings, but the speakers in the output are labeled “speaker1” and “speaker2”. I need these labels to be “Agent” and “Customer”. How can I determine in code which speaker is the agent and which is the customer, while maintaining transcript accuracy?
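To make the question concrete, here is a minimal sketch of the relabeling step I am stuck on. The segment layout (`speaker`, `start`, `end`, `text`) and the `role_map` argument are assumptions for illustration only, not the output format of any particular library. Applying the mapping is trivial; working out which generic speaker belongs in `role_map` as the agent is the open part.

```python
from typing import Dict, List


def relabel(segments: List[dict], role_map: Dict[str, str]) -> List[dict]:
    """Return copies of the segments with generic speaker labels replaced by roles.

    Labels missing from role_map are left unchanged.
    """
    relabeled = []
    for seg in segments:
        new_seg = dict(seg)  # copy so the input list is not mutated
        new_seg["speaker"] = role_map.get(seg["speaker"], seg["speaker"])
        relabeled.append(new_seg)
    return relabeled


# Hypothetical diarized output; the text and timings are made up.
segments = [
    {"speaker": "speaker1", "start": 0.0, "end": 2.5, "text": "Hello, how can I help?"},
    {"speaker": "speaker2", "start": 2.5, "end": 5.0, "text": "I have a billing question."},
]

# This mapping is exactly what I do not know how to derive automatically.
role_map = {"speaker1": "Agent", "speaker2": "Customer"}

for seg in relabel(segments, role_map):
    print(f'{seg["speaker"]}: {seg["text"]}')
```

What I am missing is a reliable way to build `role_map` for each call.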
I have multiple call recordings, but there are no consistent patterns within the calls that would make it easy to tell the speakers apart. I am looking for a general solution that produces an accurate transcript for every call. If anyone can help me achieve this, I would greatly appreciate it.