{"id":49534252,"url":"https://github.com/ladbaby/insrec","last_synced_at":"2026-05-02T09:05:10.274Z","repository":{"id":297797843,"uuid":"993117016","full_name":"Ladbaby/InsRec","owner":"Ladbaby","description":"🎹 A Musical Instrument Recognition App Using Neural Networks.","archived":false,"fork":false,"pushed_at":"2025-06-07T14:06:19.000Z","size":309,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-07T15:19:28.066Z","etag":null,"topics":["audio-classification","deep-learning","time-series","time-series-classification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ladbaby.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-30T08:43:05.000Z","updated_at":"2025-06-07T14:06:23.000Z","dependencies_parsed_at":"2025-06-07T15:29:56.755Z","dependency_job_id":null,"html_url":"https://github.com/Ladbaby/InsRec","commit_stats":null,"previous_names":["ladbaby/insrec"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Ladbaby/InsRec","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ladbaby%2FInsRec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ladbaby%2FInsRec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ladbaby%2FInsRec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ladbaby%2FInsRec/manifests","owner_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/owners/Ladbaby","download_url":"https://codeload.github.com/Ladbaby/InsRec/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ladbaby%2FInsRec/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32528665,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-02T01:12:54.858Z","status":"online","status_checked_at":"2026-05-02T02:00:05.923Z","response_time":132,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio-classification","deep-learning","time-series","time-series-classification"],"created_at":"2026-05-02T09:04:52.515Z","updated_at":"2026-05-02T09:05:10.261Z","avatar_url":"https://github.com/Ladbaby.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🎹 InsRec: Musical Instrument Recognition App\n\nUsing state-of-the-art 📈 time series analysis neural networks for musical instrument recognition!\n\n🚀 Powered by [PyOmniTS](https://github.com/Ladbaby/PyOmniTS), the unified framework for time series analysis.\n\n\u003e [!IMPORTANT]\n\u003e Accuracy is not guaranteed (and I'm not an expert in music)! Refer to the benchmark section for model performance details.\n\n## 📷 Screenshot\n\n![](images/screenshot_MIC.png)\n\n## 🌟 Features\n\nModels are currently trained on the [OpenMIC-2018 dataset](https://zenodo.org/records/1432913), which includes 20 types of \"instruments\":\n\n0. 
🪗 Accordion [[wiki]](https://en.wikipedia.org/wiki/Accordion)\n1. 🪕 Banjo [[wiki]](https://en.wikipedia.org/wiki/Banjo)\n2. Bass [[wiki]](https://en.wikipedia.org/wiki/Bass_(sound))\n3. Cello [[wiki]](https://en.wikipedia.org/wiki/Cello)\n4. Clarinet [[wiki]](https://en.wikipedia.org/wiki/Clarinet)\n5. Cymbals [[wiki]](https://en.wikipedia.org/wiki/Cymbals)\n6. 🥁 Drums [[wiki]](https://en.wikipedia.org/wiki/Drum)\n7. Flute [[wiki]](https://en.wikipedia.org/wiki/Flute)\n8. 🎸 Guitar [[wiki]](https://en.wikipedia.org/wiki/Guitar)\n9. Mallet Percussion [[wiki]](https://en.wikipedia.org/wiki/Keyboard_percussion_instrument)\n10. Mandolin [[wiki]](https://en.wikipedia.org/wiki/Mandolin)\n11. Organ [[wiki]](https://en.wikipedia.org/wiki/Organ_(music))\n12. 🎹 Piano [[wiki]](https://en.wikipedia.org/wiki/Piano)\n13. 🎷 Saxophone [[wiki]](https://en.wikipedia.org/wiki/Saxophone)\n14. Synthesizer [[wiki]](https://en.wikipedia.org/wiki/Synthesizer)\n15. Trombone [[wiki]](https://en.wikipedia.org/wiki/Trombone)\n16. 🎺 Trumpet [[wiki]](https://en.wikipedia.org/wiki/Trumpet)\n17. Ukulele [[wiki]](https://en.wikipedia.org/wiki/Ukulele)\n18. 🎻 Violin [[wiki]](https://en.wikipedia.org/wiki/Violin)\n19. 🗣️ Voice [[wiki]](https://en.wikipedia.org/wiki/Human_voice)\n\n## ⏬ Installation\n\n### From Source\n\n1. Clone this repository and its submodules, then check out the `InsRec` branch in the backend submodule.\n\n    ```shell\n    git clone --recurse-submodules https://github.com/Ladbaby/InsRec\n    cd InsRec/backend\n    git checkout InsRec\n    cd ..\n    ```\n\n2. Create a Python virtual environment with the tool of your choice.\n\n    For example, using [Miniconda](https://docs.conda.io/en/latest/miniconda.html)/[Anaconda](https://www.anaconda.com/):\n\n    ```shell\n    conda create -n InsRec python=3.12\n    conda activate InsRec\n    ```\n\n    \u003e Python 3.11 \u0026 3.12 have been tested. Other versions may also work.\n\n3. 
Install dependencies in the created environment.\n\n    ```shell\n    pip install -r backend/requirements.txt\n    pip install -r requirements.txt\n    ```\n\n    \u003e Some models may require extra dependencies, which are listed in the comments of `backend/requirements.txt`.\n\n## 🚀 Usage\n\n### Easy: Use Existing Model Weights\n\nLaunch the web UI with:\n\n```shell\nstreamlit run main.py\n```\n\nor by running `sh main.sh`.\n\nOn the first run, it asks you in the terminal whether to download the checkpoint files for the models.\n\n### Advanced: Train a Model\n\nNeural network training is powered by the [PyOmniTS](https://github.com/Ladbaby/PyOmniTS) framework.\n\nThe training procedure for the existing models on the OpenMIC-2018 dataset is detailed here.\n\n#### Obtain OpenMIC Dataset\n\n- Download the dataset from [here](https://zenodo.org/records/1432913), and place the extracted result under `backend/storage/datasets/OpenMIC`.\nCreate the parent folder if it does not exist.\n- Download the processed VGGish representations of the corresponding audio clips from [Hugging Face](https://huggingface.co/datasets/Ladbaby/InsRec-datasets/blob/main/OpenMIC/processed/x_repr_times.npy), and place the file under `backend/storage/datasets/OpenMIC/processed`.\n\n    \u003e Note that these VGGish representations differ from the \"X\" in `backend/storage/datasets/OpenMIC/openmic-2018.npz`. 
Our representations are obtained using the pretrained [PyTorch VGGish pipeline](https://docs.pytorch.org/audio/master/generated/torchaudio.prototype.pipelines.VGGISH.html) and the PCA weights from [torchvggish](https://github.com/harritaylor/torchvggish/releases/download/v0.1/vggish_pca_params-970ea276.pth).\n\n#### Train the Model\n\nYou can find the experimental settings (e.g., learning rate, d_model) for the chosen model in its script, `backend/scripts/CHOSEN_MODEL/OpenMIC.sh`.\n\nStart training with:\n\n```shell\ncd backend\nsh scripts/CHOSEN_MODEL/OpenMIC.sh\n```\n\nThe model weights `pytorch_model.bin` will be found under `backend/storage/results`.\n\nTo run inference with your trained weights instead, replace the `pytorch_model.bin` file in the `backend/storage/pretrained/OpenMIC/CHOSEN_MODEL` folder with your own.\n\n## 📊 Model Performance Benchmark\n\nTest set performance on the OpenMIC-2018 dataset:\n\n|Model|Accuracy|Precision|Recall|F1\n|---|---|---|---|---|\n|Pyraformer|67.71|64.74|64.95|64.18\n|Reformer|67.50|64.45|64.30|63.65\n|Informer|66.98|63.41|63.74|62.73\n|Nonstationary Transformer|66.46|63.88|63.06|62.66\n|Hi-Patch|65.90|63.72|60.76|61.12\n|GRU-D|65.83|63.30|62.34|61.95\n|TimesNet|65.52|62.74|61.58|61.41\n|Mamba|65.26|61.97|61.25|60.85\n|TSMixer|65.21|62.35|60.59|60.64\n|Raindrop|65.16|62.30|62.33|61.29\n|LightTS|65.05|63.32|60.34|60.95\n|Transformer|65.00|63.12|63.71|61.87\n|FEDformer|64.48|60.58|59.96|59.61\n|FreTS|64.48|62.12|59.30|59.99\n|DLinear|64.22|62.04|59.04|59.64\n|Linear|64.17|63.07|58.41|59.60\n|Leddam|63.18|59.77|59.97|58.52\n|iTransformer|63.18|60.99|57.83|58.44\n|PrimeNet|62.19|57.97|57.53|56.76\n|mTAN|60.89|53.87|44.75|46.73\n|SegRNN|58.96|61.84|50.58|53.43\n|Autoformer|54.43|52.15|50.25|50.56\n|PatchTST|42.97|43.57|37.50|37.59\n|MICN|36.61|33.68|29.31|29.54\n|SeFT|35.99|28.39|25.02|24.91\n|TiDE|34.58|30.69|30.42|30.21\n|Crossformer|21.72|1.09|5.00|1.78\n|FiLM|21.72|1.09|5.00|1.78\n\n\nExisting state-of-the-art time series models mainly learn 
in the time domain, while audio processing models primarily learn in the frequency domain. \nAlso, audio (e.g., 16k samples per second) is far longer than any time series in research datasets (e.g., 720 steps).\nTherefore, [VGGish](https://docs.pytorch.org/audio/master/generated/torchaudio.prototype.pipelines.VGGISH.html) is currently used as the encoder to convert audio input into embeddings, and time series models take them as input instead (it makes little sense, I know, but this is possibly the only painless way to adapt them).\n\nFurther improvements may require changing the network architecture of the time series models so that VGGish embeddings are treated as representations instead of time series.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fladbaby%2Finsrec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fladbaby%2Finsrec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fladbaby%2Finsrec/lists"}