https://github.com/hayatiyrtgl/audio_processing_for_cnn_network
Spectrum creation is the most important thing while dealing with audio data
https://github.com/hayatiyrtgl/audio_processing_for_cnn_network
audio audio-processing librosa preprocessing preprocessing-data python stft
Last synced: about 1 month ago
JSON representation
Spectrum creation is the most important thing while dealing with audio data
- Host: GitHub
- URL: https://github.com/hayatiyrtgl/audio_processing_for_cnn_network
- Owner: HayatiYrtgl
- License: mit
- Created: 2024-04-19T13:21:17.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-19T13:22:20.000Z (about 2 years ago)
- Last Synced: 2025-02-14T14:51:40.496Z (over 1 year ago)
- Topics: audio, audio-processing, librosa, preprocessing, preprocessing-data, python, stft
- Language: Jupyter Notebook
- Homepage:
- Size: 581 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
Audio data processing and exploratory data analysis (EDA) using the librosa library in Python. Let's break it down step by step:
1. **Importing Libraries:**
```python
import pandas as pd
import numpy as np
import librosa
import matplotlib.pyplot as plt
import seaborn as sns
import glob
```
This section imports necessary libraries for data manipulation, audio processing, visualization, and file path handling.
2. **Data Reading and EDA:**
```python
train_path = glob.glob("../DATASET/archive/audio_data/train/*/*")
```
This line uses the `glob` module to retrieve file paths matching a specific pattern. It stores the paths of audio files in the training dataset.
```python
an_audio_file = train_path[0]
librosa_audio, librosa_sample_rate = librosa.load(an_audio_file)
```
These lines load the first audio file from the training dataset using librosa's `load()` function and store the audio data and sample rate in variables.
```python
pd.Series(librosa_audio).plot(title="Raw Data")
```
This line plots the raw audio data using Pandas Series plot.
```python
an_audio_file_trimmed = librosa.effects.trim(librosa_audio, top_db=30)
```
This line trims the silent parts of the audio file using librosa's `effects.trim()` function.
```python
D = librosa.stft(an_audio_file_trimmed[0])
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
```
These lines compute the Short-Time Fourier Transform (STFT) and convert it to decibels.
```python
melspec = librosa.feature.melspectrogram(y=librosa_audio, sr=librosa_sample_rate, n_mels=40)
```
This line computes the Mel spectrogram of the audio data.
```python
librosa.display.specshow(melspec)
```
This line displays the Mel spectrogram using librosa's `display.specshow()` function.
```python
librosa.display.specshow(S_db.T)
```
This line displays the STFT in decibels. The `T` attribute transposes the matrix to fit the plot.
This code snippet gives an overview of how to load, preprocess, and visualize audio data using the librosa library in Python. If you have any specific questions or need further explanation on any part, feel free to ask!