Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Turn audio recordings into piano sheets
- Host: GitHub
- URL: https://github.com/marionchaff/piano-transcription
- Owner: MarionChaff
- Created: 2024-08-25T17:04:03.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-08-25T17:06:29.000Z (4 months ago)
- Last Synced: 2024-10-12T04:41:25.663Z (3 months ago)
- Topics: audio-files, audio-processing, deep-learning, midi, midi-files, music, piano, piano-sheet-music, python, spectrogram, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 105 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# piano-transcription
This repo contains two notebooks covering (i) audio data preprocessing and (ii) the training of a model designed to recognize piano notes. Together they can be used to turn audio recordings into piano sheet music.
**Database**
- Very short MIDI files with one or several notes played simultaneously (MIDI files contain information such as musical notes, timing, velocity, pitch, and instrument data); a sketch of such a clip is shown below
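For illustration, a minimal sketch of what one such clip might look like, written with pretty_midi. The repo does not document how the database was generated, so the chord, velocity, duration, and filename below are all made up:

```python
import pretty_midi

# One hypothetical training example: a 0.5 s clip with three notes
# sounding at once (the actual database generation isn't documented)
pm = pretty_midi.PrettyMIDI()
piano = pretty_midi.Instrument(program=0)  # acoustic grand piano
for pitch in (60, 64, 67):                 # C major triad, MIDI pitches
    piano.notes.append(
        pretty_midi.Note(velocity=100, pitch=pitch, start=0.0, end=0.5))
pm.instruments.append(piano)
pm.write("chord_000.mid")
```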
**X data pre-processing** (cf. file 1-data-preprocessing.ipynb)
- Each MIDI file (.mid) is converted into an audio file (.wav)
- Each audio file (.wav) is converted into a CQT spectrogram (.png)
- Each spectrogram (.png) is converted into a grey-scale 1D array and resized to a fixed format (60 x 150)

**y data pre-processing** (cf. file 1-data-preprocessing.ipynb)
- The notes played are extracted from the MIDI (.mid) files and are one-hot encoded (a sketch of both preprocessing steps follows)
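A minimal sketch of both preprocessing steps, assuming pretty_midi, librosa, NumPy, and Pillow are available. Only the CQT step and the 60 x 150 target size come from the lists above; the sample rate, synthesis method, resize method, and the 88-key label range are assumptions:

```python
import numpy as np
import pretty_midi
import librosa
from PIL import Image

SR = 22050           # assumed sample rate for synthesis
TARGET = (60, 150)   # (height, width) per the README

def midi_to_x(mid_path):
    """MIDI -> audio -> CQT spectrogram -> grey-scale 60 x 150 array."""
    pm = pretty_midi.PrettyMIDI(mid_path)
    # pm.synthesize renders with sine waves; the notebook may produce
    # its .wav files differently (e.g. with a soundfont)
    audio = pm.synthesize(fs=SR)
    cqt = np.abs(librosa.cqt(audio, sr=SR))           # constant-Q transform
    db = librosa.amplitude_to_db(cqt, ref=np.max)     # log-magnitude image
    img = Image.fromarray(db).resize((TARGET[1], TARGET[0]))  # PIL takes (w, h)
    return np.asarray(img, dtype=np.float32)

def midi_to_y(mid_path):
    """Multi-hot vector over the 88 piano keys (MIDI pitches 21-108, assumed)."""
    pm = pretty_midi.PrettyMIDI(mid_path)
    y = np.zeros(88, dtype=np.float32)
    for inst in pm.instruments:
        for note in inst.notes:
            if 21 <= note.pitch <= 108:
                y[note.pitch - 21] = 1.0
    return y
```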
**Model training** (cf. file 2-model-training.ipynb)
- Convolutional layers were used to extract spatial features from each spectrogram
- Sigmoid activation and a binary cross-entropy loss were used to capture multiple notes played simultaneously
- Model accuracy reaches 0.96 on the train, validation, and test sets (a sketch of such a model follows)
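A minimal Keras sketch matching that description. Only the convolutional feature extractor, the sigmoid output, and the binary cross-entropy loss follow the text; the number and size of the layers, and the 88-unit output (one per piano key), are assumptions:

```python
import tensorflow as tf

# Illustrative CNN for 60 x 150 grey-scale spectrograms -> 88 note labels
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(60, 150, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    # Sigmoid (not softmax) so several notes can be active at once
    tf.keras.layers.Dense(88, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])
```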