https://github.com/qiuqiangkong/panns_inference
- Host: GitHub
- URL: https://github.com/qiuqiangkong/panns_inference
- Owner: qiuqiangkong
- License: mit
- Created: 2020-03-08T06:22:30.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-03-05T18:07:23.000Z (over 2 years ago)
- Last Synced: 2025-08-31T19:40:45.333Z (10 months ago)
- Language: Python
- Size: 460 KB
- Stars: 236
- Watchers: 4
- Forks: 33
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE.MIT
Awesome Lists containing this project
- awesome-python-audio - panns-inference
README
# PANNs inference
**panns_inference** provides an easy-to-use Python interface for audio tagging and sound event detection. The audio tagging and sound event detection models come from PANNs (Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition): https://github.com/qiuqiangkong/audioset_tagging_cnn
## Installation
PyTorch>=1.0 is required.
```
$ pip install panns-inference
```
## Usage
```
$ python3 example.py
```
For example:
```
import librosa
from panns_inference import AudioTagging, SoundEventDetection, labels

# Load the clip as mono audio resampled to 32 kHz.
audio_path = 'examples/R9_ZSCveAHg_7s.wav'
(audio, _) = librosa.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # (batch_size, segment_samples)

# checkpoint_path=None uses the default checkpoint; pass device='cpu' if no GPU is available.
print('------ Audio tagging ------')
at = AudioTagging(checkpoint_path=None, device='cuda')
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
framewise_output = sed.inference(audio)
```
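The `clipwise_output` returned by `at.inference` holds one score per class for each clip in the batch, and `labels` is assumed to list the class names in the same order. Below is a minimal sketch of how a ranked list like the one under Results could be printed. `top_k_tags` is a hypothetical helper, not part of panns_inference, and the scores are made-up placeholders rather than model output.

```python
def top_k_tags(scores, names, k=10):
    """Return the k (name, score) pairs with the highest scores, best first."""
    if len(scores) != len(names):
        raise ValueError("scores and names must have the same length")
    # Indices sorted by descending score; works for lists and 1-D arrays.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(names[i], float(scores[i])) for i in order[:k]]


# Placeholder data for illustration only (not real model output).
demo_names = ["Speech", "Music", "Telephone", "Alarm"]
demo_scores = [0.9, 0.1, 0.2, 0.05]

for name, score in top_k_tags(demo_scores, demo_names, k=3):
    print("{}: {:.3f}".format(name, score))
```

With real model output, this could be called as `top_k_tags(clipwise_output[0], labels)`, where index 0 selects the first clip in the batch.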
## Results
------ Audio tagging ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Speech: 0.893
Telephone bell ringing: 0.754
Inside, small room: 0.235
Telephone: 0.183
Music: 0.092
Ringtone: 0.047
Inside, large room or hall: 0.028
Alarm: 0.014
Animal: 0.009
Vehicle: 0.008
------ Sound event detection ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Save fig to results/sed_result.pdf
Sound event detection plot:
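Beyond the plot, the `framewise_output` returned by `sed.inference` can also be reduced to onset and offset times per class. The following is a minimal sketch, not the method used in the repository: it thresholds one class's per-frame scores and merges consecutive active frames. `frames_to_events` is a hypothetical helper, and both the threshold of 0.5 and the rate of 100 frames per second are assumptions that should be checked against the model configuration.

```python
def frames_to_events(frame_scores, threshold=0.5, frames_per_second=100):
    """Merge consecutive frames scoring >= threshold into (onset_s, offset_s) pairs."""
    events = []
    start = None  # index of the first frame of the currently open event
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i  # event begins
        elif score < threshold and start is not None:
            events.append((start / frames_per_second, i / frames_per_second))
            start = None  # event ends
    if start is not None:  # event still active at the end of the clip
        events.append((start / frames_per_second,
                       len(frame_scores) / frames_per_second))
    return events


# Placeholder scores for a single class, for illustration only.
demo = [0.1, 0.7, 0.8, 0.2, 0.1, 0.9]
print(frames_to_events(demo, threshold=0.5, frames_per_second=10))
```

With real output, the scores for one class could be selected as `framewise_output[0][:, class_index]`, assuming the array is laid out as (batch_size, time_steps, classes_num).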

## Cite
[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition." arXiv preprint arXiv:1912.10211 (2019).