{"id":23591211,"url":"https://github.com/socaity/py_audio2face","last_synced_at":"2025-04-30T12:35:11.722Z","repository":{"id":211485548,"uuid":"729243586","full_name":"SocAIty/py_audio2face","owner":"SocAIty","description":"Use the NVIDIA Audio2Face headless server and interact with it through a requests API. Generate animation sequences for Unreal Engine 5, Maya and MetaHumans","archived":false,"fork":false,"pushed_at":"2024-07-24T16:11:10.000Z","size":28694,"stargazers_count":85,"open_issues_count":4,"forks_count":15,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-01-17T10:05:44.186Z","etag":null,"topics":["audio2emotion","audio2face","npc","nvidia","unreal-engine","virtual-avatars"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SocAIty.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"SocAIty","patreon":"w4hns1nn","buy_me_a_coffee":"socaity","custom":["paypal.me/SocaityRieger"]}},"created_at":"2023-12-08T17:46:54.000Z","updated_at":"2025-01-14T05:06:55.000Z","dependencies_parsed_at":"2024-07-24T18:33:34.584Z","dependency_job_id":"1a737d56-6652-494a-a218-7dea0bd510dc","html_url":"https://github.com/SocAIty/py_audio2face","commit_stats":null,"previous_names":["w4hns1nn/py_audio2face","socaity/py_audio2face"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SocAIty%2Fpy_audio2face","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/SocAIty%2Fpy_audio2face/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SocAIty%2Fpy_audio2face/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SocAIty%2Fpy_audio2face/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SocAIty","download_url":"https://codeload.github.com/SocAIty/py_audio2face/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235426489,"owners_count":18988420,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio2emotion","audio2face","npc","nvidia","unreal-engine","virtual-avatars"],"created_at":"2024-12-27T07:34:52.141Z","updated_at":"2025-04-30T12:35:11.716Z","avatar_url":"https://github.com/SocAIty.png","language":"Python","funding_links":["https://github.com/sponsors/SocAIty","https://patreon.com/w4hns1nn","https://buymeacoffee.com/socaity","paypal.me/SocaityRieger"],"categories":[],"sub_categories":[],"readme":"  \u003ch1 align=\"center\" style=\"margin-top:-25px\"\u003epy_Audio2Face\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n  \u003cimg align=\"center\" src=\"docs/py_audio2face_icon.png\" height=\"250\" /\u003e\n\u003c/p\u003e\n  \u003ch2 align=\"center\" style=\"margin-top:-10px\"\u003eGenerate expressive facial animation from audio\u003c/h2\u003e\n\n# Overview\n\nThis Python script leverages the [headless](https://docs.omniverse.nvidia.com/audio2face/latest/user-manual/rest-api.html) mode of [Audio2Face](https://www.nvidia.com/en-us/omniverse/apps/audio2face/) to generate 
animations for characters:\n- lip movement animations\n- face animations \n- emotions (new since 2023.2.0)\n- It provides methods to control the [Audio2Face headless server](https://docs.omniverse.nvidia.com/audio2face/latest/user-manual/rest-api.html) and interact with it through a requests API.\n\n\nUse cases:\n- Power your games and movies with expressive facial animations. \n- Create natural-looking virtual avatars.\n- Generate animations for characters and export them as USD files, for example for Maya or Unreal Engine 5.\n- Stream audio data to the Audio2Face server to generate animations in real-time. Use Live Link to bring them into UE5.\n\n# Demo and Tutorial\n\n\u003ca href=\"https://www.youtube.com/watch?v=L0jOBdXy5IE\" align=\"center\"\u003e\n  \u003cimg align=\"center\" src=\"docs/video_thumbnail.png\" height=\"300\" /\u003e\n\u003c/a\u003e\n\n\u003c/br\u003e\u003c/br\u003e\n**Check out [socaity.ai](https://www.socaity.ai) for a ton of related generative AI models and free hosting.**\n\n# Installation\n\nInstall the `py_audio2face` package using pip:\n```bash\n# With pip. Note: this version does not include the streaming feature\npip install py_audio2face\n# Also install the streaming feature. 
This includes additional dependencies like grpcio and protobuf\npip install py_audio2face[streaming]\n# Or install the latest version from GitHub to work with the newest version of Audio2Face\npip install git+https://github.com/SocAIty/py_audio2face.git\n```\n\nModify the `ROOT_DIR`, `DEFAULT_OUTPUT_DIR`, and `DEFAULT_AUDIO_STREAM_GRPC_PORT` variables in the `settings.py` file as needed.\n\n## Prerequisites\n\n- [Audio2Face](https://www.nvidia.com/en-us/omniverse/download/) installed on the system\n- Python 3.x\n\n# Usage\n\nFirst, initialize an Audio2Face instance:\n```python\nimport py_audio2face as pya2f\na2f = pya2f.Audio2Face()\n```\n\n### Generate animation for audio files:\n\n```python\n# Generate animation for a single audio file\na2f.audio2face_single(\n    audio_file_path=\"path/to/audio/file.wav\", #  path of the audio file you want to animate\n    output_path=\"path/to/output/animation.usd\",  # path where the animation file will be saved\n    fps=60, # frames per second for the animation. Higher fps will result in smoother animations and longer processing time\n    emotion_auto_detect=True  # automatically detect emotions in the audio file. If False, the manually set emotion is used\n)\n\n# Generate animation for an entire folder of audio files\na2f.audio2face_folder(input_folder=\"path/to/my/folder\", output_folder='/output', fps=60)\n```\n\n### Configure Emotions\n\nThe emotion mixin lets you control the strength of the emotions in the generated animation. \n```python\n# Applies a preferred emotion even in emotion_auto_detect with update_settings=True\na2f.set_emotion(anger=0.9, disgust=0.5, fear=0.1, sadness=0.2, update_settings=True)\n# Animate with a fixed set of emotions. \nanimated = a2f.audio2face_single(test_audio_1,'emotion_full.usd', fps=24, emotion_auto_detect=False)\n# Testing with expressive emotions. 
\nwith_preferred_emotion_detect = a2f.audio2face_single(test_audio_1,'emotion_preset_detect.usd', fps=24, emotion_auto_detect=True)\n```\n\nYou can also configure the emotion detection behaviour with the `a2e_set_settings` method. \n```python\na2f.a2e_set_settings(a2e_emotion_strength=0.5, a2e_smoothing_exp=0)\n```\n\n\n### Stream audio to Audio2Face:\n\nInstead of providing paths to the a2f headless server, you can stream the audio data directly to the server. \nThis is useful if you want to generate animations in real-time, for a live stream or in a server setting.\n\nFor this example we use the media-toolkit to stream audio data. Install it with `pip install media-toolkit[AudioFile]`.\n\n```python\nfrom media_toolkit import AudioFile\naudio = AudioFile().from_file(\"path/to/audio/file.wav\")\naudio_stream = audio.to_stream()  # note: this can be any python generator that yields numpy arrays/bytes of audio data\na2f.stream_audio(audio_stream, output_path=\"path/to/output/animation.usd\", fps=60)\n```\nUnder the hood, streaming loads a different scene with a streaming audio player in the init method.\nThe audio data is then streamed to the server with gRPC requests.\n\n**Shutdown Audio2Face Server:**\n```python\na2f.shutdown_a2f()\n```\n\n# Related Projects\n\nWhy bother recording audio files? \n- Convert text to speech with [SpeechCraft](https://github.com/SocAIty/SpeechCraft). Use the natural-sounding speech and feed it into audio2face.\n- Want to sound like any other character? Use [RVC](https://github.com/SocAIty/Retrieval-based-Voice-Conversion-FastAPI) to clone any voice. 
This sounds so real, you won't notice the difference from a real one.\n- Create and animate realistic looking characters with [MetaHuman](https://metahuman.unrealengine.com/) and audio2face\n\nExplore the [Generative AI-Model-Zoo](https://www.socaity.ai/APIs/Overview) on [socaity.ai](https://www.socaity.ai)\n- Use [face2face](https://github.com/SocAIty/face2face) for realistic deep-fakes.\n- Combine multiple tools using the socaity [python-SDK](https://github.com/SocAIty/socaity) or [typescript SDK](https://github.com/SocAIty/socaity-js) for your ultimate genAI project.\n\n# Contribute\n\nAny contribution is appreciated.\n- [x] streaming: better clean-code for setting streaming mode. Especially with emotions, and default instance.\n- [x] streaming: upgrade to protobuf 5\n- [x] streaming: allow streaming back of blendshapes or on the fly export\n- [x] create working unit tests\n- [x] Provide different characters not only mark.usd\n- [x] Allow multiple generations / streams at the same time\n\nPlease raise an issue if you have any suggestions, feature requests or need help with the script.\n\n\n# Acknowledgements\n\nA big thank you to the NVIDIA team, who made Audio2Face a great tool.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsocaity%2Fpy_audio2face","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsocaity%2Fpy_audio2face","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsocaity%2Fpy_audio2face/lists"}