Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rathod-shubham/nvidia-audio2face
OmniAvatar combines NVIDIA Omniverse Audio2Face and LLMs to create avatars that interact with lifelike expression and context-aware responses, making virtual communication more immersive and dynamic in real-time.
- Host: GitHub
- URL: https://github.com/rathod-shubham/nvidia-audio2face
- Owner: RATHOD-SHUBHAM
- Created: 2024-11-08T11:39:19.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2024-11-08T12:14:51.000Z (3 months ago)
- Last Synced: 2025-01-08T15:24:32.940Z (27 days ago)
- Topics: agents, api, asr, audio2face, avatar, fastapi, llm, nvidia, omniverse, openai, python3, tts
- Language: Jupyter Notebook
- Homepage: https://www.linkedin.com/posts/shubhamshankar_avatar-langchain-ai-activity-7256653777858908161-79-g?utm_source=share&utm_medium=member_desktop
- Size: 1.26 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# NVIDIA-Audio2Face
# OmniAvatar
Welcome to OmniAvatar, a project that brings lifelike virtual avatars to reality by seamlessly integrating Large Language Models (LLMs) with NVIDIA Omniverse’s Audio2Face technology. Imagine avatars that don’t just look real but engage in dynamic, real-time conversations, responding to emotions and context with authenticity and depth. OmniAvatar transforms static digital characters into vivid personalities, revolutionizing how virtual interactions feel and respond.

## Demo
[OMNIAVATAR](https://www.linkedin.com/posts/shubhamshankar_avatar-langchain-ai-activity-7256653777858908161-79-g?utm_source=share&utm_medium=member_desktop)

## Overview
OmniAvatar leverages the power of LLMs and NVIDIA Omniverse Audio2Face to create avatars capable of real-time interaction and expression. NVIDIA Omniverse provides a solid foundation with its APIs, SDKs, and services, enabling developers to build advanced AI-driven systems with sophisticated simulation workflows. Through OmniAvatar, avatars can not only mimic human speech and expression but also respond intelligently and naturally, creating a more engaging and lifelike interaction experience.
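Concretely, the repository’s topics (asr, llm, tts, audio2face) imply a speech-in, speech-out loop. The sketch below illustrates that control flow only; every helper in it is a hypothetical stand-in for the corresponding module in this repository, not the project’s confirmed interface:

```
# Illustrative control flow only; all helpers are hypothetical placeholders.

def transcribe(wav_path: str) -> str:          # ASR: user speech -> text
    raise NotImplementedError

def generate_reply(text: str) -> str:          # LLM: text -> contextual response
    raise NotImplementedError

def synthesize(text: str) -> str:              # TTS: response -> WAV file path
    raise NotImplementedError

def push_audio_to_a2f(wav_path: str) -> None:  # stream WAV to Audio2Face
    raise NotImplementedError

def handle_utterance(mic_wav_path: str) -> None:
    # One conversational turn: hear, think, speak, animate.
    reply_wav = synthesize(generate_reply(transcribe(mic_wav_path)))
    push_audio_to_a2f(reply_wav)
```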
## Project Structure
This repository is organized into two main parts, each serving a key role in building OmniAvatar:

### 1] LLM Code
Contains all code related to the LLM that powers the conversation and response generation. This module enables the avatar to understand and process user inputs and generate meaningful, contextually relevant replies.
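The module’s exact interface isn’t documented here, but given the repo’s openai and llm topics, a minimal context-aware reply loop might look like the following sketch; the model name and system prompt are assumptions:

```
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A rolling message history is what makes replies context-aware.
history = [{"role": "system", "content": "You are a friendly on-screen avatar."}]

def generate_reply(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; substitute your own
        messages=history,
    ).choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```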
### 2] Streaming Server
This module connects the LLM to the avatar, acting as the bridge between text-based intelligence and visual, expressive output. It streams data between the LLM and Audio2Face to synchronize audio, text, and facial animation.
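The repo’s fastapi topic suggests the bridge exposes an HTTP surface, though how it actually does so isn’t documented here. Purely as an assumed illustration of this module’s role (the route, payload, and both helpers below are hypothetical):

```
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class SpeakRequest(BaseModel):
    text: str

def synthesize(text: str) -> str:
    # Hypothetical TTS helper: returns the path of the rendered WAV.
    raise NotImplementedError

def push_audio_to_a2f(wav_path: str) -> None:
    # Hypothetical forwarder to Audio2Face (see the gRPC sketch at the end).
    raise NotImplementedError

@app.post("/speak")
def speak(req: SpeakRequest) -> dict:
    # Turn the LLM's text reply into audio and hand it to Audio2Face.
    push_audio_to_a2f(synthesize(req.text))
    return {"status": "ok"}
```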
## Getting Started
1. Clone the repository.
2. Follow the setup instructions in each module’s README file to install dependencies and configure the environment.
3. Start the Streaming Server to enable real-time communication between the LLM and the avatar in Omniverse Audio2Face.
4. Run the LLM Code to initialize conversation processing.

### Prerequisites
1. NVIDIA Omniverse installed and configured.
2. Access to Audio2Face in the Omniverse environment.
3. Python 3.x for LLM code execution.
4. Streaming server dependencies as outlined in the Streaming Server folder.

## Customizing Avatar Configuration
The Streaming Server module provides the flexibility to change the avatar configuration, including setting up personalized avatars, by adjusting paths in the test_client.py script. The changes needed in the test_client.py file within the Streaming Server folder are described below.
### Steps to Customize
1. Locate test_client.py

Go to the Streaming Server folder and open test_client.py.

2. Set the Audio File Path
Update the path to the WAV audio file that will be streamed to Audio2Face:
```
# Local input WAV file path
audio_fpath = r"C:\Users\SmgHima\Desktop\AvatarAI-oct-14\LLMCode\voiceservice\outputaudio\audio.wav"
```

Replace this with the desired path for the audio input file.

3. Set the Instance Name
Choose the appropriate instance_name for the Audio2Face Streaming Audio Player. This setting directs where to push the audio data on the Omniverse stage:
```
# Prim path of the Audio2Face Streaming Audio Player on the stage (where to push the audio data)
instance_name = "/World/audio2face/PlayerStreaming"
```
By default, instance_name is set to /World/audio2face/PlayerStreaming. To personalize the avatar setup, replace this path with the appropriate instance path for your specific configuration.
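To see how audio_fpath and instance_name come together, here is a condensed sketch of a push over gRPC, modeled on NVIDIA’s Audio2Face streaming sample client. It assumes the audio2face_pb2 / audio2face_pb2_grpc stubs that ship with that sample, plus grpcio, numpy, and soundfile:

```
import grpc
import numpy as np
import soundfile
import audio2face_pb2        # generated stubs from NVIDIA's sample client
import audio2face_pb2_grpc

def push_audio(url: str, audio_fpath: str, instance_name: str) -> None:
    # Audio2Face expects mono float32 samples plus the sample rate.
    audio_data, samplerate = soundfile.read(audio_fpath, dtype="float32")
    if audio_data.ndim > 1:
        audio_data = np.average(audio_data, axis=1)  # downmix to mono
    with grpc.insecure_channel(url) as channel:
        stub = audio2face_pb2_grpc.Audio2FaceStub(channel)
        request = audio2face_pb2.PushAudioRequest()
        request.audio_data = audio_data.tobytes()
        request.samplerate = samplerate
        request.instance_name = instance_name
        request.block_until_playback_is_finished = True
        stub.PushAudio(request)

if __name__ == "__main__":
    # audio_fpath and instance_name are the values set in steps 2 and 3.
    push_audio("localhost:50051", audio_fpath, instance_name)
```

localhost:50051 is the sample client’s default streaming address; adjust it if your Audio2Face gRPC settings differ.

---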