https://github.com/lukereichold/visual-speech-separation
Flask app to demo multimodal deep learning speech separation in videos via TensorFlow Serving
- Host: GitHub
- URL: https://github.com/lukereichold/visual-speech-separation
- Owner: lukereichold
- License: apache-2.0
- Created: 2020-01-19T21:55:19.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T03:28:33.000Z (over 3 years ago)
- Last Synced: 2025-10-11T07:52:59.858Z (6 months ago)
- Topics: 3d-convolutional-network, computer-vision, convolutional-neural-networks, deep-learning, flask, multimodal-deep-learning, multisensory, speech-separation, tensorflow-serving
- Language: Python
- Homepage:
- Size: 20.8 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# About Basis
Basis is a proof-of-concept web app that provides an interactive demonstration of separating on- and off-screen audio sources in a given video.
It leverages the [speech separation model by Andrew Owens et al.](http://andrewowens.com/multisensory/) and builds on their [open-source code and models](https://github.com/andrewowens/multisensory), licensed under the Apache License 2.0.
I built this as an opportunity to learn:
- The implementation details of a "legacy" TensorFlow 1.x model
- How to freeze, inspect, and serve a model with TensorFlow Serving
- How to perform real-time inference on video from a public web app
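The serving step above typically boils down to the web app POSTing preprocessed inputs to TensorFlow Serving's REST predict endpoint. The sketch below, which is an illustration rather than this repo's actual code, builds such a request body; the model name `multisensory` and the input tensor names `video` and `audio` are assumptions (the real names come from the frozen model's signature, e.g. via `saved_model_cli show`):

```python
import json

# Hypothetical TF-Serving endpoint; 8501 is TF-Serving's default REST port,
# but the model name "multisensory" is an assumption for illustration.
SERVING_URL = "http://localhost:8501/v1/models/multisensory:predict"

def build_predict_request(video_frames, audio_samples):
    """Package preprocessed inputs into TF-Serving's REST predict format.

    The input keys ("video", "audio") are placeholders; substitute the
    names exposed by the model's serving signature.
    """
    return json.dumps({
        "signature_name": "serving_default",
        "inputs": {
            "video": video_frames,   # e.g. nested lists of frame tensors
            "audio": audio_samples,  # e.g. a list of waveform samples
        },
    })

# A Flask view would then send this body and read back the separated audio,
# roughly:
#   resp = requests.post(SERVING_URL, data=body,
#                        headers={"Content-Type": "application/json"})
#   outputs = resp.json()["outputs"]
```

Keeping the heavy model behind TF-Serving this way lets the Flask app stay a thin preprocessing/postprocessing layer, which is the architecture the repo description implies.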