{"id":14036955,"url":"https://github.com/FIGLAB/ubicoustics","last_synced_at":"2025-07-27T04:33:31.443Z","repository":{"id":47309451,"uuid":"141447920","full_name":"FIGLAB/ubicoustics","owner":"FIGLAB","description":"Accompanying repository for Ubicoustics: Plug-and-Play Acoustic Activity Recognition","archived":false,"fork":false,"pushed_at":"2022-10-04T13:12:25.000Z","size":21118,"stargazers_count":168,"open_issues_count":4,"forks_count":47,"subscribers_count":14,"default_branch":"master","last_synced_at":"2024-04-27T23:32:29.875Z","etag":null,"topics":["activity","audio","audio-event-predictions","microphone","sound-activity"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FIGLAB.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-07-18T14:41:55.000Z","updated_at":"2024-04-10T04:45:42.000Z","dependencies_parsed_at":"2023-01-19T05:15:59.740Z","dependency_job_id":null,"html_url":"https://github.com/FIGLAB/ubicoustics","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FIGLAB%2Fubicoustics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FIGLAB%2Fubicoustics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FIGLAB%2Fubicoustics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FIGLAB%2Fubicoustics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FIGLAB","download_url":"https://codeload.github.com/FIGLAB/ubicoustics/tar.gz
/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":214982086,"owners_count":15811653,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["activity","audio","audio-event-predictions","microphone","sound-activity"],"created_at":"2024-08-12T03:02:21.437Z","updated_at":"2025-07-27T04:33:31.428Z","avatar_url":"https://github.com/FIGLAB.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Research Code for Ubicoustics\nThis is the research repository for Ubicoustics: Plug-and-Play Acoustic Activity Recognition (UIST 2018). It contains the base toolchain and demo for Ubicoustics. [More information about the research can be found here](http://www.gierad.com/projects/ubicoustics).\n\n![](https://github.com/FIGLAB/ubicoustics/blob/master/media/ubicoustics_a.gif?raw=true)\n![](https://github.com/FIGLAB/ubicoustics/blob/master/media/ubicoustics_b.gif?raw=true)\n\n# System Requirements\nThe deep learning system is written in `python3`, specifically `tensorflow` and `keras`.\n\n## Installation\n```bash\nconda create -n \"ubicoustics\" python=3.8\nconda activate ubicoustics\nconda install numpy\nconda install scipy\npython -m pip install tensorflow\npython -m pip install wget\nconda install pyaudio\n```\n\n# Example Demos\nOnce the dependencies above are installed, try these four demos.  
They require our pre-trained model, which is not part of this repo (due to file size restrictions), but we provide a downloader script to simplify this process.\n\n## Example #0: File Prediction (Simple)\nThis simple demo reads an audio file, extracts features, and feeds them into the ML model for offline prediction. Once you've installed all dependencies, run `example_fileprediction_simple.py`:\n\n```shell\n(ubicoustics)$ python example_fileprediction_simple.py\n```\nThe script will automatically download a model file called `example_model.hdf5` into the `/models` directory (if it doesn't already exist). It's an 865.8MB file, so the download might take a while depending on your Internet connection. The script will then perform audio event detection on `example.wav`.\n\n```shell\n=====\nChecking model...\n=====\nDownloading example_model.hdf5 [867MB]:\n100% [.......................................]\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Knocking (1.00)\nPrediction: Knocking (0.99)\nPrediction: Knocking (0.91)\n```\n\n## Example #1: File Prediction (Playback)\nNext, you can run the demo that plays back an audio file via `example_fileprediction_playback.py`:\n\n```shell\n(ubicoustics)$ python example_fileprediction_playback.py\n```\n\nIt's similar to the previous example, but it uses `pyaudio`'s non-blocking mechanism to process audio buffers at a given sample length. 
We insert `ubicoustics` predictions within that block:\n\n```python\n# Audio FORMAT\nFORMAT = pyaudio.paInt16\nCHANNELS = 1\nRATE = 16000\nCHUNK = RATE\n\n# Input files\nwf = wave.open('example.wav', 'rb')\n\n# Callback\ndef audio_samples(input, frame_count, time_info, status_flags):\n  in_data = wf.readframes(frame_count)\n  # Audio Processing Code here\n  # ...\n  return (in_data, pyaudio.paContinue)\n\n# Non-Blocking Call\np = pyaudio.PyAudio()\nstream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE, output=True, frames_per_buffer=CHUNK, stream_callback=audio_samples)\n\n```\nThe script will use your computer's speakers to playback the audio file while displaying its predictions. If everything runs correctly, you should get the following output:\n\n```shell\n=====\nChecking model...\n=====\nBeginning prediction for example.wav (use speakers for playback):\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (0.06)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Chopping (0.96)\nPrediction: Knocking (1.00)\nPrediction: Knocking (1.00)\nPrediction: Knocking (0.99)\n```\n\n\n## Example #2: Live Prediction (Simple)\n\nThis next example will use your system microphone to perform live audio event predictions.\n\n```bash\n(ubicoustics)$ python example_liveprediction_simple.py\n```\n\nIt will check your system for a list of available microphones, and it will prompt you to choose one.\n\n```\n=====\n1 / 2: Checking Microphones...\n=====\n=== Available Microphones: ===\n# 0 - Built-in Microphone\n# 1 - Gierad's Apple AirPods\n======================================\nSelect 
microphone [0]: 0\n```\n\nYou can also use the `--mic \u003cmic-id\u003e` flag to specify which local microphone to use.\n\n```\n(ubicoustics)$ python example_liveprediction_simple.py --mic 0\n```\n\nThe system will then perform audio event predictions using the chosen microphone. Your output should look something like this:\n\n```\nUsing mic: # 0 - Built-in Microphone\n=====\n2 / 2: Checking model...\n=====\n# Live Prediction Using Microphone: # 0 - Built-in Microphone\nPrediction: Typing (0.97)\nPrediction: Typing (0.97)\nPrediction: Typing (1.00)\nPrediction: Knocking (1.00)\nPrediction: Knocking (0.46)\nPrediction: Door In-Use (0.15)\nPrediction: Door In-Use (0.12)\nPrediction: Door In-Use (0.12)\nPrediction: Phone Ringing (0.46)\nPrediction: Phone Ringing (0.86)\n```\n\nThis script makes predictions every second and should run on most platforms, including a `Raspberry Pi B+` (1GB of memory, 16GB disk space, with 4GB set as SWAP). If you need help setting up your RPi, send an email to Gierad (gierad.laput@cs.cmu.edu).\n\nTo manually grab a list of microphones for your system, just run `microphones.py`:\n```bash\n(ubicoustics)$ python microphones.py\n```\n\n## Example #3: Live Prediction (Detail View)\n\nTo form a better intuition for the system's behavior, this final example shows all the knobs you can turn, including the system's confidence values, along with audio levels and some parameters that you can tweak (e.g., thresholds) for real-time prediction.\n\n```\n(ubicoustics)$ python example_liveprediction_detail.py\n```\n\nHere's an example screenshot:\n\n![](https://github.com/FIGLAB/ubicoustics/blob/master/media/example_liveprediction_detail.gif?raw=true)\n\nPrediction confidence values will be shown in real time. The system checks whether the highest confidence value exceeds a given threshold AND whether the audio levels are high enough to warrant an event trigger (e.g., \u003e -40dB). 
These thresholds can be adjusted using the `PREDICTION_THRES` and `DBLEVEL_THRES` parameters:\n\n```python\nPREDICTION_THRES = 0.8 # confidence\nDBLEVEL_THRES = -40 # dB\n```\n\n# Reference\nGierad Laput, Karan Ahuja, Mayank Goel, Chris Harrison. 2018. Ubicoustics: Plug-and-Play Acoustic Activity Recognition. In Proceedings of the 31st Annual Symposium on User Interface Software and Technology (UIST '18). ACM, New York, NY, USA.\n\n[Download the paper here](http://www.gierad.com/assets/ubicoustics/ubicoustics.pdf).\n\nBibTeX Reference:\n```\n@inproceedings {ubicoustics,\n  author={Laput, G. and Ahuja, K. and Goel, M. and Harrison, C.},\n  title={Ubicoustics: Plug-and-Play Acoustic Activity Recognition},\n  booktitle={Proceedings of the 31st Annual Symposium on User Interface Software and Technology},\n  series={UIST '18},\n  year={2018},\n  location={Berlin, Germany},\n  numpages={10},\n  publisher={ACM},\n  address={New York, NY, USA}\n}\n```\n\n# License\nDue to licensing restrictions, audio `wav` files and training code are only available by request (for now), and can only be used for research (i.e., non-commercial) purposes. Otherwise, Ubicoustics is freely available for non-commercial use, and may be redistributed under these conditions. Please see the license file for further details. For a commercial license, please contact Gierad Laput, Chris Harrison, and the CMU Technology Transfer Office (innovation@cmu.edu).\n\n# Appendix A: Raspberry Pi\n\nWe've received several requests to document how we made Ubicoustics run on a Raspberry Pi. Be forewarned that we're pushing the limits of what an RPi can do, but that's part of the fun. If you're all for it, here's what you'll need:\n\n## Hardware\n\n1. [Raspberry Pi 3 Model B+ - CortexA53 with 1GB](https://www.adafruit.com/product/3775)\n\n![](https://cdn-shop.adafruit.com/970x728/3775-04.jpg)\n\n2. 
[ReSpeaker 2-Mics Pi HAT (or better)](https://m.seeedstudio.com/productDetail/2874)\n\n![](https://statics3.seeedstudio.com/seeed/img/2017-06/HwIkCfuzRZ5EauL7Q4xmaj3D.jpg)\n\n3. [16GB Micro SD Card (Class 10)](https://www.amazon.com/SanDisk-COMINU024966-16GB-microSD-Card/dp/B004KSMXVM)\n\n![](https://isabela.iweb.co.uk/resize/ZT0xMjA5NjAwJmg9NTAwJnE9NzUmdD1vdXRib3VuZCZ1cmw9aHR0cHMlM0ElMkYlMkZzdGF0aWMubXltZW1vcnkuY28udWslMkZtZWRpYSUyRmNhdGFsb2clMkZwcm9kdWN0JTJGUyUyRmElMkZTYW5EaXNrLTM0NjQ0NEIuanBnJnc9NTAw/)\n\n4. [5V 2.5A Switching Power Supply (Highly Recommended)](https://www.adafruit.com/product/1995)\n\n![](https://cdn-shop.adafruit.com/970x728/1995-02.jpg)\n\n5. Alternative: Google AIY Kit\n\nYou can also use Google's AIY Voice Kit as a starting point. It uses roughly the same configuration as above, except you'll need a 16GB Micro SD Card (the AIY Kit comes with an 8GB card). It is markedly slower than an RPi B+.\n\nYou can buy the Google AIY Voice Kit from [Target](https://www.target.com/p/-/A-53416295).\n\n![](https://aiyprojects.withgoogle.com/static/images/voice-v2/guide/voice-000.jpg)\n\nThe Google AIY Kit is the absolute bare-bones configuration that we've tested. Again, it is significantly slower than an RPi B+ (but it has built-in Google integrations that you might find useful for other apps).\n\n## Software and Configuration\n\n\n### Install Raspbian OS\nWe recommend using Raspbian Lite as the operating system, but other flavors will do. There's an entire process that documents [how to flash your SD card with the latest Raspbian OS](https://www.raspberrypi.org/documentation/installation/installing-images/). If you're running macOS, we recommend [going through this documentation](https://www.raspberrypi.org/documentation/installation/installing-images/mac.md).\n\n### Install Microphone Drivers\n\nNext, install the audio drivers for the ReSpeaker 2-Mics Pi HAT. 
Detailed documentation about the process [can be found here](http://wiki.seeedstudio.com/ReSpeaker_2_Mics_Pi_HAT/).\n\n```bash\n$ git clone https://github.com/respeaker/seeed-voicecard.git\n$ cd seeed-voicecard\n$ sudo ./install.sh\n$ reboot\n```\n\n### Configure SWAP Space\nOnce flashed, change your RPi configuration so that it uses extra memory SWAP space. You can do this by editing `/etc/dphys-swapfile`:\n\n```\n$ sudo nano /etc/dphys-swapfile\n```\n\nUncomment the `CONF_SWAPSIZE`  and `CONF_MAXSWAP` parameters, and set them to a value that is roughly above 1G. In our case, we've set it to `4096`.\n\n```\nCONF_SWAPSIZE=4096\nCONF_MAXSWAP=4096\n```\n\n### Install `virtualenv`\nOnce you've configured your SWAP space, we recommend creating a `virtualenv` environment. You can do this via:\n```bash\n$ sudo pip3 install virtualenv\n$ virtualenv ./ubicoustics -p python3\n$ source ubicoustics/bin/activate\n\n```\n\nOnce `virtualenv` is installed, clone the repo and follow the instructions above to run Ubicoustics.\n\n```bash\n(ubicoustics)$ git clone https://github.com/FIGLAB/ubicoustics.git\n(ubicoustics)$ cd ubicoustics\n(ubicoustics)$ python example_fileprediction_simple.py\n```\n\nOnce running, you'll get warning messages, specifically about exceeding memory. Don't worry— RPi will keep chugging along until `tensorflow` completes loading the model. Once done, your output should look something like this:\n\n```\nDownloading example_model.hdf5 [867MB]:\n100% [......................................................................] 
865808944 / 865808944\nUsing deep learning model: models/example_model.hdf5\n2018-12-17 15:09:08.039745: W tensorflow/core/framework/allocator.cc:113] Allocation of 1024 exceeds 10% of system memory.\n2018-12-17 15:09:08.040125: W tensorflow/core/framework/allocator.cc:113] Allocation of 1024 exceeds 10% of system memory.\n2018-12-17 15:09:08.040456: W tensorflow/core/framework/allocator.cc:113] Allocation of 2048 exceeds 10% of system memory.\n2018-12-17 15:09:08.040689: W tensorflow/core/framework/allocator.cc:113] Allocation of 2048 exceeds 10% of system memory.\n2018-12-17 15:09:08.040904: W tensorflow/core/framework/allocator.cc:113] Allocation of 16384 exceeds 10% of system memory.\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Coughing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Toilet Flushing (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Water Running (1.00)\nPrediction: Knocking (1.00)\nPrediction: Knocking (0.99)\nPrediction: Knocking (0.91)\n```\n\nFeel free to try the other examples. It takes a while for RPi to run the script, but once loaded, you'll have a prediction framerate of about 1Hz. If you've made it this far, congratulations! :)\n\n```bash\n(ubicoustics)$ python example_liveprediction_simple.py\n```\n\nThat's it! For help, questions, and general feedback, contact Gierad Laput (gierad.laput@cs.cmu.edu).\n\n## Disclaimer\n\n```\nTHE PROGRAM IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL, BUT WITHOUT ANY WARRANTY. IT IS PROVIDED \"AS IS\" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n\nIN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW THE AUTHOR WILL BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF THE AUTHOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFIGLAB%2Fubicoustics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFIGLAB%2Fubicoustics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFIGLAB%2Fubicoustics/lists"}