We’re developing a web application with the following requirements:
Create and display multiple custom avatars in the UI
Allow users to speak to any avatar
Generate responses using LLMs on the backend
Display avatar animations synced with the responses
Our current approach:
Using NVIDIA’s Audio2Face to create custom avatars and exporting them as USD files
User speech flow (a code sketch follows this list):
Convert audio to text
Use LLM to generate an answer
Convert answer text to audio
Send audio to Audio2Face via REST API
Display the resulting animation in the UI
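In code, we imagine the flow looking roughly like this (TypeScript on a Node 18+ backend; `transcribe`, `generateAnswer`, and `synthesizeSpeech` are hypothetical stand-ins for whichever local STT/LLM/TTS we end up running, and the Audio2Face port, endpoint path, and payload are assumptions to verify against the headless server’s docs):

```ts
// Sketch of the speech flow above. transcribe / generateAnswer /
// synthesizeSpeech are placeholders for local STT / LLM / TTS services;
// the Audio2Face endpoint, port, and payload are assumptions to verify.

declare function transcribe(audio: Buffer): Promise<string>;      // local STT
declare function generateAnswer(prompt: string): Promise<string>; // local LLM
declare function synthesizeSpeech(text: string): Promise<string>; // local TTS -> WAV path

const A2F_BASE = "http://localhost:8011"; // default headless A2F port (verify)

async function handleUserSpeech(audio: Buffer): Promise<void> {
  const question = await transcribe(audio);       // 1. audio -> text
  const answer = await generateAnswer(question);  // 2. LLM generates an answer
  const wavPath = await synthesizeSpeech(answer); // 3. text -> WAV on disk

  // 4. Hand the WAV to Audio2Face. Endpoint name and body are assumptions;
  //    check the interactive API docs the headless server exposes.
  const res = await fetch(`${A2F_BASE}/A2F/Player/SetTrack`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      a2f_player: "/World/audio2face/Player", // player prim path in the USD stage
      file_name: wavPath,
    }),
  });
  if (!res.ok) throw new Error(`Audio2Face call failed: ${res.status}`);
}
```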
What we’ve done so far:
Downloaded Omniverse and the Audio2Face service
Created avatars and exported them as USD files
Explored the REST APIs of the Audio2Face service (sample calls below)
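Besides the SetTrack call sketched above, these are the other calls we have been experimenting with. The endpoint paths and payload fields are what we pieced together from the server’s interactive API docs (http://localhost:8011/docs in our install) and may differ between Audio2Face versions, so treat them as assumptions:

```ts
// Rough shape of our other calls against the headless server. All paths
// and payload fields below are assumptions to double-check against the
// interactive API docs; casing and field names vary by version.

const A2F = "http://localhost:8011";

async function post(path: string, body: unknown): Promise<unknown> {
  const res = await fetch(`${A2F}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`${path} failed: ${res.status}`);
  return res.json();
}

// Load the avatar stage once at startup (path is illustrative).
await post("/A2F/USD/Load", { file_name: "/avatars/avatar.usd" });

// After SetTrack has run, export the solved blendshape animation as JSON
// so the web UI can play it back on the avatar mesh.
await post("/A2F/Exporter/ExportBlendshapes", {
  solver_node: "/World/audio2face/BlendshapeSolve",
  export_directory: "/exports",
  file_name: "answer_weights",
  format: "json",
});
```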
Open questions:
How do we incorporate the USD files into our web UI?
How do we display the avatar animations in response to user questions?
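Our current thinking on the browser side, which we would appreciate feedback on: as far as we can tell, browser renderers have little native USD support, so we assume an offline USD-to-glTF/GLB conversion (e.g. via Blender) that preserves the blendshapes, then drive the mesh’s morph targets from the exported per-frame weights. The file paths, the `WeightsFile` layout (`facsNames` / `weightMat` / `fps`), and `playAnimation` below are all illustrative assumptions:

```ts
// Browser-side sketch with three.js. Assumes the USD avatar was converted
// offline to GLB with its blendshapes intact, and that the backend serves
// the exported per-frame weights as JSON (layout below is assumed).

import * as THREE from "three";
import { GLTFLoader } from "three/examples/jsm/loaders/GLTFLoader.js";

interface WeightsFile {
  fps: number;           // export frame rate
  facsNames: string[];   // blendshape names, in export order
  weightMat: number[][]; // [frame][blendshape] weight values
}

const scene = new THREE.Scene();
scene.add(new THREE.AmbientLight(0xffffff, 1));

const renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

const camera = new THREE.PerspectiveCamera(
  45, window.innerWidth / window.innerHeight, 0.1, 100);
camera.position.set(0, 1.6, 2);

const gltf = await new GLTFLoader().loadAsync("/assets/avatar.glb");
scene.add(gltf.scene);

// Find the face mesh that carries the morph targets.
let face: THREE.Mesh | undefined;
gltf.scene.traverse((obj) => {
  const mesh = obj as THREE.Mesh;
  if (mesh.isMesh && mesh.morphTargetDictionary) face = mesh;
});

// Step through the exported frames, writing each blendshape weight into
// the matching morph target influence.
async function playAnimation(url: string): Promise<void> {
  const data: WeightsFile = await (await fetch(url)).json();
  let frame = 0;
  const step = () => {
    if (!face || frame >= data.weightMat.length) return;
    data.facsNames.forEach((name, i) => {
      const idx = face!.morphTargetDictionary![name];
      if (idx !== undefined) {
        face!.morphTargetInfluences![idx] = data.weightMat[frame][i];
      }
    });
    frame++;
    setTimeout(step, 1000 / data.fps); // simple fixed-rate playback
  };
  step();
}

renderer.setAnimationLoop(() => renderer.render(scene, camera));
```

Starting an `<audio>` element with the TTS WAV at the same moment `playAnimation` begins should keep lips and sound roughly in sync; a `requestAnimationFrame` loop keyed to `audio.currentTime` would presumably be tighter.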
Constraints:
Must be implemented on a local GPU system
Cannot use third-party APIs for this purpose
Any guidance or suggestions on how to accomplish this would be greatly appreciated. Thank you!