I am using Undertone in a Unity project to recognize the user's speech input and transcribe it in real time.
The recognition code was largely copied from the RealtimeTranscriber.cs example, which is included in their demo scene and on the website. It has a public field, SpeechEngine Engine, and an event declared as public event TextTranscribed OnTextTranscribed.
In another script (call it SpeechCheck.cs), the transcription is obtained by holding a reference to the listener and subscribing to the event:
RealtimeTranscriber listener;

// In initialize()
// Subscribe to the event that fires whenever text is transcribed
listener.OnTextTranscribed += text =>
{
    _text = text;
};

// In Update()
// This is called when the user is supposed to say something;
// `_text` is then updated with the recognized transcription.
listener.StartListening();
Since the user is supposed to speak as in a conversation, several objects in the scene have SpeechCheck.cs attached, each handling the input differently.
The first time SpeechCheck.cs is used and the listener is started, everything works fine: _text gets updated and the recognition is fairly accurate. However, after the first object with SpeechCheck.cs is disabled, the second object with a SpeechCheck.cs component does not work: _text stays empty.
To rule out a parameter that failed to reset, I even tried creating new components and assigning them at run time:
// In initialize()
// Create a fresh engine and listener on this GameObject and wire them together
Engine = gameObject.AddComponent<SpeechEngine>();
listener = gameObject.AddComponent<SpeechListener>();
listener.Engine = Engine;

// At the end
// Stop listening, then remove both components
listener.StopListening();
Destroy(GetComponent<SpeechListener>());
Destroy(GetComponent<SpeechEngine>());
This was to no avail: the listener still only works the first time it is used.
What might be causing this problem?
Since their website does not show the code, here are the method headers:
public class RealtimeTranscriber : MonoBehaviour
{
    public SpeechEngine Engine;
    public event TextTranscribed OnTextTranscribed;
    //...
    private void Start();
    public bool VADTriggered(float[] samples);
    private IEnumerator TranscribeCoroutine();
    public void StartListening();
    public void StopListening();
    private async Task TranscribeAudioClipAsync(float[] samples, bool flush);
    private void RecalculateWindowSize();
    private void OnDestroy();
}