https://github.com/yoyolicoris/mir_hw2
- Host: GitHub
- URL: https://github.com/yoyolicoris/mir_hw2
- Owner: yoyolicoris
- Created: 2018-05-15T13:18:20.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-06-06T07:35:12.000Z (over 6 years ago)
- Last Synced: 2024-10-25T21:57:00.890Z (19 days ago)
- Topics: autocorrelation, beats, dynamic-programming, tempo-estimation, tempogram
- Language: Python
- Size: 233 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
# Homework 2 for Music Information Retrieval
## Environment
* ubuntu 16.04 LTS
* python3.5.2 (using Pycharm 2018.1.4)
* extra modules: numpy, scipy, matplotlib, prettytable, librosa

## Dataset
All the experiments were done on the [Ballroom dataset](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html),
which consists of 30-second excerpts of ballroom dance music.

> How to use?

1. Download the [raw audio files](http://www.iua.upf.edu/mtg/ismir2004/contest/tempoContest/data1.tar.gz).
2. Download the [tempo annotations](http://www.iua.upf.edu/mtg/ismir2004/contest/tempoContest/data2.tar.gz) for each piece.
3. Download the [beat annotations](https://github.com/CPJKU/BallroomAnnotations).
4. Modify the directory variables in [utils.py](utils.py) to point to where you unzipped the data.
```
data_dir = '/where/you/put/audio/files'
bpm_label_dir = '/where/you/put/bpm/annotations'
beat_label_dir = '/where/you/put/beat/annotations'
```
## Usage of each file

### Q1~3: Tempo estimation using Fourier tempogram
In this task, we use the Fourier tempogram, which is the short-time Fourier transform of the [spectral flux](https://en.wikipedia.org/wiki/Spectral_flux) novelty curve of the music,
to perform tempo estimation.

$ python3 Q1-3.py
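
The Fourier tempogram idea described in this section can be sketched as follows. This is a minimal illustration using only numpy, not the code in `Q1-3.py`: the function names (`spectral_flux`, `fourier_tempogram`, `top_two_tempi`), the Hann windows, the frame sizes, and the peak-picking rule are all assumptions made for the example.

```python
import numpy as np

def spectral_flux(x, n_fft=1024, hop=512):
    """Half-wave rectified spectral flux of a mono signal (assumed settings)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Keep only increases in magnitude between consecutive frames.
    return np.maximum(np.diff(mag, axis=0), 0.0).sum(axis=1)

def fourier_tempogram(novelty, frame_rate, bpms, win_len, hop=1):
    """Magnitude STFT of the novelty curve, evaluated at the tempi in `bpms`."""
    freqs = np.asarray(bpms, dtype=float) / 60.0      # BPM -> Hz
    t = np.arange(win_len) / frame_rate               # time inside one window
    kernels = np.exp(-2j * np.pi * np.outer(freqs, t)) * np.hanning(win_len)
    n_frames = 1 + (len(novelty) - win_len) // hop
    segments = np.stack([novelty[i * hop:i * hop + win_len]
                         for i in range(n_frames)])
    return np.abs(kernels @ segments.T)               # shape (n_bpm, n_frames)

def top_two_tempi(tempogram, bpms):
    """Two strongest local maxima of the time-averaged tempogram."""
    s = tempogram.mean(axis=1)
    peaks = np.flatnonzero((s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:])) + 1
    best = peaks[np.argsort(s[peaks])[::-1][:2]]
    return np.asarray(bpms)[best]
```

Here `frame_rate` is the sampling rate of the novelty curve (audio sample rate divided by the hop size), and the two returned tempi play the role of the two candidates that the P-score and ALOTC score are computed from.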
Running Q1-3.py prints the P-score and ALOTC (at least one tempo correct) score for each of the eight genres:
+---------------+---------+--------+--------------+--------------+--------------+
| Genre | P-score | ALOTC | 1/2T P-score | 1/3T P-score | 1/4T P-score |
+---------------+---------+--------+--------------+--------------+--------------+
| ChaChaCha | 0.3285 | 0.7387 | 0.4968 | 0.0085 | 0.1571 |
| Jive | 0.4543 | 0.8333 | 0.1583 | 0.0437 | 0.0131 |
| Quickstep | 0.4785 | 0.9146 | 0.0085 | 0.0000 | 0.0000 |
| Rumba | 0.2185 | 0.4898 | 0.5141 | 0.0083 | 0.2032 |
| Samba | 0.1332 | 0.2907 | 0.4468 | 0.0000 | 0.3912 |
| Tango | 0.4958 | 0.9535 | 0.3841 | 0.0000 | 0.0000 |
| VienneseWaltz | 0.5323 | 0.9692 | 0.0854 | 0.0000 | 0.0000 |
| Waltz | 0.4057 | 0.8000 | 0.3560 | 0.0283 | 0.0448 |
+---------------+---------+--------+--------------+--------------+--------------+

### Q4~5: Tempo estimation using autocorrelation tempogram
In this task, we use the [autocorrelation](https://en.wikipedia.org/wiki/Autocorrelation) tempogram to perform tempo estimation.
$ python3 Q4-5.py
+---------------+---------+--------+--------------+--------------+--------------+------------+------------+------------+
| Genre | P-score | ALOTC | 1/2T P-score | 1/3T P-score | 1/4T P-score | 2T P-score | 3T P-score | 4T P-score |
+---------------+---------+--------+--------------+--------------+--------------+------------+------------+------------+
| ChaChaCha | 0.5064 | 0.9820 | 0.3512 | 0.0000 | 0.0000 | 0.1333 | 0.0000 | 0.0000 |
| Jive | 0.4544 | 0.9333 | 0.0000 | 0.0077 | 0.0000 | 0.5022 | 0.0000 | 0.0204 |
| Quickstep | 0.4331 | 0.8780 | 0.0000 | 0.0000 | 0.0000 | 0.4717 | 0.0114 | 0.0241 |
| Rumba | 0.4533 | 0.9184 | 0.4554 | 0.0000 | 0.0048 | 0.0099 | 0.0000 | 0.0000 |
| Samba | 0.3885 | 0.7674 | 0.4262 | 0.0000 | 0.0400 | 0.0210 | 0.0000 | 0.0000 |
| Tango | 0.5144 | 0.9535 | 0.2381 | 0.0000 | 0.0000 | 0.2021 | 0.0000 | 0.0000 |
| VienneseWaltz | 0.5071 | 0.9077 | 0.0445 | 0.0000 | 0.0000 | 0.2104 | 0.1377 | 0.0000 |
| Waltz | 0.2908 | 0.5364 | 0.3277 | 0.0313 | 0.0609 | 0.0000 | 0.0037 | 0.0000 |
+---------------+---------+--------+--------------+--------------+--------------+------------+------------+------------+
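
For reference, an autocorrelation tempogram of the kind used in Q4~5 can be sketched like this. It is an illustrative version under stated assumptions, not the code in `Q4-5.py`: the function name, the Hann window, the FFT-based autocorrelation, and the default BPM range are all choices made for the example.

```python
import numpy as np

def autocorrelation_tempogram(novelty, frame_rate, win_len, hop=1,
                              bpm_range=(30, 300)):
    """Windowed autocorrelation of a novelty curve, one row per lag.

    Returns (tempogram, bpms), where bpms[k] is the tempo of row k.
    """
    # A lag of L frames corresponds to a tempo of 60 * frame_rate / L BPM.
    min_lag = max(1, int(np.ceil(60.0 * frame_rate / bpm_range[1])))
    max_lag = min(win_len - 1, int(np.floor(60.0 * frame_rate / bpm_range[0])))
    lags = np.arange(min_lag, max_lag + 1)
    bpms = 60.0 * frame_rate / lags

    window = np.hanning(win_len)
    n_frames = 1 + (len(novelty) - win_len) // hop
    out = np.empty((len(lags), n_frames))
    for i in range(n_frames):
        seg = novelty[i * hop:i * hop + win_len] * window
        # Autocorrelation via the power spectrum; zero-padding to 2 * win_len
        # avoids circular wrap-around.
        spec = np.fft.rfft(seg, n=2 * win_len)
        acf = np.fft.irfft(np.abs(spec) ** 2)[:win_len]
        out[:, i] = acf[lags]
    return out, bpms
```

For a perfectly periodic novelty curve the autocorrelation peaks not only at the beat period but also at its integer multiples, which correspond to 1/2, 1/3, ... of the tempo.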
### Q6: Tempo estimation by combining frequency and periodicity
In this task, the teacher asked us to find a way to improve on and outperform the methods above.

> How to improve?

Histogram of the two most probable tempi using the Fourier tempogram:
![](images/q1_quickstep.png)

Histogram of the two most probable tempi using the autocorrelation tempogram:
![](images/q4_quickstep.png)

As you can see, the Fourier tempogram tends to produce tempi that are multiples of the true tempo;
on the other hand, the autocorrelation tempogram tends to produce tempi that are one half of the true tempo.
So I decided to combine the two tempograms, letting each one suppress the other's unwanted values and keep the most probable one.

> How to do it?

An intuitive way to do this is to map both tempograms to the same domain and then multiply them together.
The implementation details are similar to this [paper](https://dl.acm.org/citation.cfm?id=2824149).

$ python3 Q6.py
+---------------+---------+--------+
| Genre | P-score | ALOTC |
+---------------+---------+--------+
| ChaChaCha | 0.4962 | 0.9910 |
| Jive | 0.5229 | 0.9667 |
| Quickstep | 0.3957 | 0.8293 |
| Rumba | 0.4061 | 0.8776 |
| Samba | 0.3617 | 0.7326 |
| Tango | 0.6069 | 0.9884 |
| VienneseWaltz | 0.4444 | 0.8000 |
| Waltz | 0.4546 | 0.8091 |
+---------------+---------+--------+
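
The "map to the same domain, then multiply" step can be sketched as below. This is only an illustration of the idea, assuming both tempograms have the same number of time frames; the function name, the linear interpolation, and the clipping of negative values are assumptions, and it is not a reproduction of the linked paper or of `Q6.py`.

```python
import numpy as np

def combine_tempograms(fourier_tg, fourier_bpms, acf_tg, acf_bpms):
    """Resample the autocorrelation tempogram onto the Fourier BPM axis,
    then multiply the two element-wise.

    fourier_tg: (n_fourier_bpm, n_frames), acf_tg: (n_lags, n_frames).
    """
    fourier_bpms = np.asarray(fourier_bpms, dtype=float)
    acf_bpms = np.asarray(acf_bpms, dtype=float)
    order = np.argsort(acf_bpms)        # np.interp needs increasing x values
    acf_on_grid = np.stack(
        [np.interp(fourier_bpms, acf_bpms[order], acf_tg[order, i])
         for i in range(acf_tg.shape[1])],
        axis=1,
    )
    # Negative correlations carry no tempo evidence, so clip them to zero.
    acf_on_grid = np.maximum(acf_on_grid, 0.0)
    return fourier_tg * acf_on_grid
```

A peak survives the product only if both representations support it, so a harmonic that appears only in the Fourier tempogram and a sub-harmonic that appears only in the autocorrelation tempogram are both attenuated.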
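The Q6 results show that some genres improve while others do not: Jive, Tango, and Waltz reach a higher P-score than with either single tempogram, whereas Quickstep and VienneseWaltz score lower than with both.

### Q7: Beat tracking using dynamic programming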
In this task, we use the same dataset to perform beat tracking. The algorithm we used is described [here](https://www.ee.columbia.edu/~dpwe/pubs/Ellis07-beattrack.pdf).
I used the CFP (combined frequency and periodicity) method from Q6 to compute the tempo required by the algorithm.

The program will output the precision, recall, and F-score for each genre, evaluated with a tolerance of ±70 ms.
$ python3 Q7.py
+---------------+-----------+--------+----------+
| Genre | Precision | Recall | F-scores |
+---------------+-----------+--------+----------+
| ChaChaCha | 0.6381 | 0.9792 | 0.7727 |
| Jive | 0.9386 | 0.8072 | 0.8680 |
| Quickstep | 0.9262 | 0.6444 | 0.7600 |
| Rumba | 0.5558 | 0.9378 | 0.6979 |
| Samba | 0.4332 | 0.8356 | 0.5706 |
| Tango | 0.8807 | 0.9154 | 0.8977 |
| VienneseWaltz | 0.9132 | 0.6732 | 0.7750 |
| Waltz | 0.5526 | 0.8018 | 0.6543 |
+---------------+-----------+--------+----------+
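
The dynamic-programming beat tracker from the linked Ellis paper can be sketched roughly as follows. This is a simplified version written for illustration, not the code in `Q7.py`: the function name, the `tightness` default, and the rule for choosing the last beat are assumptions.

```python
import numpy as np

def track_beats(onset_env, period, tightness=100.0):
    """Pick beat frames from a novelty curve given a beat period in frames.

    score[t] = onset_env[t] + max over earlier frames p of
               (score[p] - tightness * log((t - p) / period) ** 2)
    """
    n = len(onset_env)
    period = int(round(period))
    score = np.asarray(onset_env, dtype=float).copy()
    backlink = np.full(n, -1)

    # Search for the previous beat between 2 periods and half a period ago.
    offsets = np.arange(-2 * period, -(period // 2) + 1)
    penalty = -tightness * np.log(-offsets / period) ** 2

    for t in range(n):
        prev = t + offsets
        # Predecessors before the start of the signal count as score 0.
        prev_score = np.where(prev >= 0, score[np.maximum(prev, 0)], 0.0)
        cand = prev_score + penalty
        best = int(np.argmax(cand))
        score[t] += cand[best]
        backlink[t] = prev[best] if prev[best] >= 0 else -1

    # Start from the best-scoring frame in the final period and backtrace.
    start = max(0, n - period)
    t = start + int(np.argmax(score[start:]))
    beats = []
    while t >= 0:
        beats.append(int(t))
        t = backlink[t]
    return np.array(beats[::-1])
```

The penalty is zero when the gap between consecutive beats equals `period` and grows with the squared log-ratio otherwise, so larger `tightness` values keep the beats closer to the estimated tempo.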
### Downbeat tracking
This is a bonus question, and I just did some trial and error to see what would happen. I used the beats from Q7 to construct
a bidirectional spectral flux novelty curve at the beat level, and used the same algorithm as in Q7 to find the downbeat path
with a fixed period of 4 samples (that is, I assume the beats are evenly spaced and the time signature is 4/4).

The result is clearly not good, but it's fun to try a traditional technique instead of a fancy machine learning method.
$ python3 downbeat.py
+---------------+-----------+--------+----------+
| Genre | Precision | Recall | F-scores |
+---------------+-----------+--------+----------+
| ChaChaCha | 0.1204 | 0.1680 | 0.1403 |
| Jive | 0.0878 | 0.0679 | 0.0765 |
| Quickstep | 0.0823 | 0.0515 | 0.0634 |
| Rumba | 0.0673 | 0.1037 | 0.0816 |
| Samba | 0.1794 | 0.3167 | 0.2291 |
| Tango | 0.5912 | 0.5463 | 0.5679 |
| VienneseWaltz | 0.3285 | 0.1639 | 0.2187 |
| Waltz | 0.2155 | 0.2076 | 0.2114 |
+---------------+-----------+--------+----------+
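
The precision, recall, and F-score columns in the Q7 and downbeat tables can be computed along the following lines. This is a minimal sketch, assuming greedy one-to-one matching within the ±70 ms tolerance stated for Q7; the function name `beat_prf` and the matching strategy are assumptions, not taken from the repository.

```python
import numpy as np

def beat_prf(estimated, reference, tolerance=0.07):
    """Precision, recall and F-score of estimated beat times (in seconds).

    Each reference beat may be matched by at most one estimated beat that
    lies within `tolerance` seconds of it.
    """
    est = np.sort(np.asarray(estimated, dtype=float))
    ref = np.sort(np.asarray(reference, dtype=float))
    used = np.zeros(len(est), dtype=bool)
    hits = 0
    for r in ref:
        dist = np.abs(est - r)
        dist[used] = np.inf                 # an estimate cannot be reused
        if len(dist) and dist.min() <= tolerance:
            used[dist.argmin()] = True
            hits += 1
    precision = hits / len(est) if len(est) else 0.0
    recall = hits / len(ref) if len(ref) else 0.0
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score
```

With this definition, a tracker that outputs extra beats between the annotated ones keeps a high recall but loses precision, while a tracker that skips every other beat shows the opposite pattern.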