https://github.com/lukereichold/visual-speech-separation

Flask app to demo multimodal deep learning speech separation in videos via TensorFlow Serving
3d-convolutional-network computer-vision convolutional-neural-networks deep-learning flask multimodal-deep-learning multisensory speech-separation tensorflow-serving

# About Basis

Basis is a proof-of-concept web app which provides an interactive demonstration of separating on/off-screen audio sources for a given video.

It uses the [speech separation model created by Andrew Owens et al.](http://andrewowens.com/multisensory/) to split a video's soundtrack into its on-screen and off-screen sources. This project is based upon [open-source code and models](https://github.com/andrewowens/multisensory) licensed under the Apache License 2.0.

I built this as an opportunity to learn:

- Implementation details of a "legacy" TensorFlow 1.x model
- How to freeze, inspect, and host a model using TF-Serving
- How to perform real-time inference on video from a public web app
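
To give a rough idea of what the "host a model using TF-Serving" part looks like from the web app's side, here is a minimal sketch of a client for TensorFlow Serving's REST predict endpoint. It is an illustration rather than the code used in this project: the model name `separation`, the input key `audio`, and the helper names `build_predict_request` and `predict` are all assumptions, and the port assumes TF-Serving's default REST port (8501).

```python
import json
import urllib.request


def build_predict_request(host, model_name, inputs, port=8501):
    """Build (but do not send) a TF-Serving REST predict request.

    `model_name` and the keys of `inputs` are hypothetical here; they must
    match the name and signature of the model actually being served.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Columnar request format: named input tensors under the "inputs" key.
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request, timeout=30.0):
    """Send the request and return the model outputs from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Columnar requests are answered with an "outputs" key.
        return json.loads(response.read())["outputs"]
```

A Flask route could call `build_predict_request("localhost", "separation", {"audio": samples})` and pass the result to `predict`, keeping request construction separate from the network call so that it can be tested without a running model server.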