https://github.com/nourine-nadir/speech_processing

This repository explores speech processing techniques such as noise cancellation and speech segmentation through Python code. (Speech recognition coming soon.)

artificial-intelligence noise-cancellation speech-processing speech-segmentation



Host: GitHub
URL: https://github.com/nourine-nadir/speech_processing
Owner: Nourine-Nadir
Created: 2024-04-20T17:46:29.000Z (about 1 year ago)
Default Branch: master
Last Pushed: 2024-09-09T17:14:12.000Z (8 months ago)
Last Synced: 2024-09-09T21:28:10.456Z (8 months ago)
Topics: artificial-intelligence, noise-cancellation, speech-processing, speech-segmentation
Language: Jupyter Notebook
Homepage:
Size: 8.39 MB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: ReadMe.md

README

## Speech Processing

This repository contains code for various speech processing tasks.

**Subfolders:**

* **speech_preprocessing:** Contains functions for preprocessing speech audio, including:
    * Spectrogram calculation
    * Noise cancellation (using spectral subtraction)
* **wavs:** Stores audio files used in the examples.

**Files:**

* **Noise_cancelling.py:** Implements functions for calculating spectrograms, noise cancellation, and audio reconstruction.
* **speech_segmentation.py:** Implements a basic speech segmentation engine using energy-based detection.

**Noise Cancellation**

The `Noise_cancelling.py` file provides functions for:

* Spectrogram calculation (`spectrogram` and `spectrogram2wav`) for converting between time and frequency domains.
* Noise cancellation (`NoiseCancelling`) using spectral subtraction to remove noise from an audio signal.
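
As a rough illustration of the spectral subtraction idea named above, the sketch below uses only `numpy`. It is not the code from `Noise_cancelling.py`: the function names (`stft`, `istft`, `spectral_subtraction`), the frame sizes, the `floor` parameter, and the assumption that the first few frames contain only noise are all hypothetical choices made for this example.

```python
import numpy as np

def stft(signal, frame_len=512, hop=256):
    """Short-time Fourier transform using a periodic Hann window."""
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, frame_len=512, hop=256):
    """Inverse STFT by plain overlap-add.

    A periodic Hann window at 50% overlap sums to one, so no extra
    normalisation is applied (only the first and last half frame
    remain attenuated by the window).
    """
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out

def spectral_subtraction(noisy, noise_frames=10, floor=0.02):
    """Subtract an average noise magnitude spectrum from every frame.

    Assumes (hypothetically) that the first `noise_frames` frames
    contain background noise only. Samples that do not fill a whole
    final frame are dropped, so the output can be slightly shorter
    than the input.
    """
    spec = stft(noisy)
    mag, phase = np.abs(spec), np.angle(spec)
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Keep a small fraction of the original magnitude as a spectral
    # floor so that no bin becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase when going back to the time domain.
    return istft(clean_mag * np.exp(1j * phase))
```

The noisy phase is reused because spectral subtraction only estimates magnitudes. A larger `floor` leaves more residual noise but tends to reduce the "musical noise" artifacts this method is known for.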

**Speech Segmentation**

The `speech_segmentation.py` file demonstrates a simple speech segmentation engine based on energy levels. The `Engine` class implements methods for:

* Calculating energy of the audio signal.
* Updating a dynamic threshold based on background noise.
* Detecting speech segments based on energy exceeding the threshold for a minimum duration.
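
The three steps listed above can be sketched as follows. This is an assumption-laden illustration, not the repository's `Engine` class: the function names (`frame_energy`, `detect_segments`) and the parameters `ratio`, `alpha`, `min_frames`, and `init_frames` are hypothetical, and the threshold rule (a multiple of a running background-noise estimate) is one common choice among several.

```python
import numpy as np

def frame_energy(signal, frame_len=400, hop=200):
    """Mean squared amplitude of each overlapping frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([np.mean(signal[i * hop:i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def detect_segments(energy, ratio=4.0, alpha=0.95,
                    min_frames=5, init_frames=5):
    """Return (start_frame, end_frame) pairs, end exclusive.

    The threshold is `ratio` times a running estimate of the
    background-noise energy. The estimate is initialised from the
    first `init_frames` frames (assumed to be silence) and updated
    only on frames classified as non-speech.
    """
    noise = np.mean(energy[:init_frames])
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > ratio * noise:
            if start is None:
                start = i  # a candidate speech segment begins
        else:
            # Exponential moving average tracks slow noise changes.
            noise = alpha * noise + (1 - alpha) * e
            if start is not None:
                # Keep the segment only if it lasted long enough.
                if i - start >= min_frames:
                    segments.append((start, i))
                start = None
    # Close a segment that runs to the end of the signal.
    if start is not None and len(energy) - start >= min_frames:
        segments.append((start, len(energy)))
    return segments
```

Multiplying a frame index by `hop` and dividing by the sample rate converts it to seconds. Raising `min_frames` suppresses short clicks at the cost of missing very brief utterances.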

**Running the Examples**

The scripts import `numpy`, `matplotlib.pyplot`, `scipy.io.wavfile`, and `IPython.display` (for audio playback), so the `numpy`, `matplotlib`, `scipy`, and `ipython` packages must be installed before you run them.

The provided code snippets demonstrate each function. You can modify them to suit your specific needs.

**Future Work**

This repository serves as a starting point for speech processing tasks. Here are some potential areas for future development:

* Implementing more sophisticated noise cancellation techniques.
* Utilizing feature extraction methods for speaker recognition or speech classification.
* Refining the speech segmentation engine for better accuracy.

Feel free to explore, modify, and contribute to this repository!
