https://github.com/qiuqiangkong/panns_inference

Host: GitHub
URL: https://github.com/qiuqiangkong/panns_inference
Owner: qiuqiangkong
License: mit
Created: 2020-03-08T06:22:30.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-03-05T18:07:23.000Z (over 2 years ago)
Last Synced: 2025-08-31T19:40:45.333Z (11 months ago)
Language: Python
Size: 460 KB
Stars: 236
Watchers: 4
Forks: 33
Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE.MIT

Awesome Lists containing this project

awesome-python-audio - panns-inference

README

# PANNs inference

**panns_inference** provides an easy-to-use Python interface for audio tagging and sound event detection. Both models come from PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition: https://github.com/qiuqiangkong/audioset_tagging_cnn

## Installation

PyTorch>=1.0 is required.

```
$ pip install panns-inference
```
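
Because PyTorch is a hard requirement and the usage example below requests `device='cuda'`, it can help to check what the local environment offers before running anything. The following is a minimal sketch; `pick_device` is a hypothetical helper, not part of panns_inference, and passing `'cpu'` as the `device` argument is an assumption that this README does not demonstrate.

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing; panns_inference will not run until it is installed.
        return 'cpu'
    return 'cuda' if torch.cuda.is_available() else 'cpu'


device = pick_device()
print('Selected device:', device)
```

The returned string could then be used in place of the hard-coded `'cuda'` in the usage example, assuming the classes accept it.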

## Usage

```
$ python3 example.py
```

For example:

```
import librosa
import panns_inference
from panns_inference import AudioTagging, SoundEventDetection, labels

audio_path = 'examples/R9_ZSCveAHg_7s.wav'

# Load the clip as mono audio resampled to 32 kHz.
(audio, _) = librosa.core.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # (batch_size, segment_samples)

print('------ Audio tagging ------')
# With checkpoint_path=None, the checkpoint path printed under Results is used.
at = AudioTagging(checkpoint_path=None, device='cuda')
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
framewise_output = sed.inference(audio)
```
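
The example above returns raw scores without printing them. A ranked list like the one under Results could be produced with a sketch along these lines. It assumes that `clipwise_output[0]` is a one-dimensional sequence of per-class scores and that `labels` lists the class names in the same order; `top_k_tags` is a hypothetical helper, and the toy values are borrowed from the Results section purely for illustration.

```python
def top_k_tags(scores, label_names, k=10):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(zip(label_names, scores), key=lambda pair: pair[1], reverse=True)
    return [(name, float(score)) for name, score in ranked[:k]]


# Toy stand-ins for `labels` and `clipwise_output[0]` (assumed layout).
toy_labels = ['Music', 'Speech', 'Telephone', 'Alarm']
toy_scores = [0.092, 0.893, 0.183, 0.014]

for name, score in top_k_tags(toy_scores, toy_labels, k=3):
    print('{}: {:.3f}'.format(name, score))
```

With real model output, the call would look like `top_k_tags(clipwise_output[0], labels)`, provided the assumptions about shape and ordering hold.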

## Results


------ Audio tagging ------

Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth

GPU number: 1

Speech: 0.893

Telephone bell ringing: 0.754

Inside, small room: 0.235

Telephone: 0.183

Music: 0.092

Ringtone: 0.047

Inside, large room or hall: 0.028

Alarm: 0.014

Animal: 0.009

Vehicle: 0.008

------ Sound event detection ------

Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth

GPU number: 1

Save fig to results/sed_result.pdf
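
Besides the saved figure, the framewise scores can be turned into a list of timed events. The sketch below thresholds each class and merges consecutive active frames into one segment. It assumes that `framewise_output[0]` has the layout (time_steps, classes_num); `detect_events`, the 0.5 threshold, and the default of 100 frames per second are illustrative assumptions, not values stated in this README.

```python
def detect_events(framewise, label_names, threshold=0.5, frames_per_second=100):
    """Return (label, onset_s, offset_s) for each run of frames at or above threshold.

    framewise: one clip as a sequence of frames, each a sequence of per-class scores.
    """
    events = []
    for class_idx, name in enumerate(label_names):
        onset = None  # frame index where the current active run started
        for t, frame in enumerate(framewise):
            active = frame[class_idx] >= threshold
            if active and onset is None:
                onset = t
            elif not active and onset is not None:
                events.append((name, onset / frames_per_second, t / frames_per_second))
                onset = None
        if onset is not None:  # the run lasts until the end of the clip
            events.append((name, onset / frames_per_second,
                           len(framewise) / frames_per_second))
    return events


# Toy clip: 4 frames, 2 classes, 2 frames per second.
toy_labels = ['Speech', 'Telephone']
toy_frames = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.2, 0.9],
]
events = detect_events(toy_frames, toy_labels, threshold=0.5, frames_per_second=2)
print(events)
```

A fixed threshold is the simplest choice; per-class thresholds or smoothing may work better in practice, but neither is covered here.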



Sound event detection plot:



## Cite

[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition." arXiv preprint arXiv:1912.10211 (2019).
