Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nourine-nadir/speech_processing
This repository explores speech processing techniques such as noise cancellation and speech segmentation through Python code. (Speech recognition coming soon.)
artificial-intelligence noise-cancellation speech-processing speech-segmentation
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/nourine-nadir/speech_processing
- Owner: Nourine-Nadir
- Created: 2024-04-20T17:46:29.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2024-09-09T17:14:12.000Z (4 months ago)
- Last Synced: 2024-09-09T21:28:10.456Z (4 months ago)
- Topics: artificial-intelligence, noise-cancellation, speech-processing, speech-segmentation
- Language: Jupyter Notebook
- Homepage:
- Size: 8.39 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: ReadMe.md
README
## Speech Processing
This repository contains code for various speech processing tasks.
**Subfolders:**
* **speech_preprocessing:** Contains functions for preprocessing speech audio, including:
    * Spectrogram calculation
    * Noise cancellation (using spectral subtraction)
* **wavs:** Stores audio files used in the examples.

**Files:**
* **Noise_cancelling.py:** Implements functions for calculating spectrograms, noise cancellation, and audio reconstruction.
* **speech_segmentation.py:** Implements a basic speech segmentation engine using energy-based detection.

**Noise Cancellation**
The `Noise_cancelling.py` file provides functions for:
* Spectrogram calculation (`spectrogram` and `spectrogram2wav`) for converting between time and frequency domains.
* Noise cancellation (`NoiseCancelling`) using spectral subtraction to remove noise from an audio signal.

**Speech Segmentation**
The `speech_segmentation.py` file demonstrates a simple speech segmentation engine based on energy levels. The `Engine` class implements methods for:
* Calculating energy of the audio signal.
* Updating a dynamic threshold based on background noise.
* Detecting speech segments based on energy exceeding the threshold for a minimum duration.

**Running the Examples**
This repository requires `numpy`, `matplotlib`, and `scipy`, plus `IPython` for audio playback (the code imports `matplotlib.pyplot`, `scipy.io.wavfile`, and `IPython.display`). Make sure they are installed before running the scripts.
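As a rough illustration of the spectral subtraction idea used for noise cancellation, here is a NumPy-only sketch. It is not the repository's `spectrogram`, `spectrogram2wav`, or `NoiseCancelling` code: the function names `stft`, `istft`, and `spectral_subtraction`, the frame and hop sizes, and the `noise_frames` and `floor` parameters are all assumptions chosen for the example. It also assumes the first few frames of the recording contain only background noise.

```python
import numpy as np


def stft(signal, frame_len=512, hop=128):
    """Short-time Fourier transform with a Hann window (frames x frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame_len=512, hop=128):
    """Inverse STFT using weighted overlap-add with the same Hann window."""
    window = np.hanning(frame_len)
    frames = np.fft.irfft(spec, n=frame_len, axis=1) * window
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
        norm[i * hop:i * hop + frame_len] += window ** 2
    # Guard against division by (near) zero at the signal edges.
    return out / np.maximum(norm, 1e-8)


def spectral_subtraction(noisy, noise_frames=10, floor=0.02):
    """Subtract an average noise magnitude spectrum from every frame.

    Assumption: the first `noise_frames` frames contain noise only.
    """
    spec = stft(noisy)
    magnitude, phase = np.abs(spec), np.angle(spec)
    noise_mag = magnitude[:noise_frames].mean(axis=0)
    # Keep a small fraction of the original magnitude so values never go negative.
    cleaned = np.maximum(magnitude - noise_mag, floor * magnitude)
    # Reuse the noisy phase when going back to the time domain.
    return istft(cleaned * np.exp(1j * phase))


# Synthetic demo: 0.25 s of noise only, then a 440 Hz tone buried in the same noise.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = np.zeros(sr)
clean[sr // 4:] = 0.5 * np.sin(2 * np.pi * 440 * t[sr // 4:])
noisy = clean + 0.05 * rng.standard_normal(sr)
denoised = spectral_subtraction(noisy)
```

To try this on a real recording, one option is to load samples with `scipy.io.wavfile.read` and convert them to floating point before calling `spectral_subtraction`. Plain spectral subtraction tends to leave "musical noise" artifacts, which is one reason the `floor` term is included here.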
The provided code snippets demonstrate each of these functions. You can modify them to suit your needs.
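For the energy-based segmentation described under Speech Segmentation, the three steps (frame energy, a dynamic threshold that tracks background noise, and a minimum-duration check) could be sketched as below. This is a hypothetical stand-in, not the repository's `Engine` class: the function name `segment_speech` and the parameters `frame_ms`, `min_speech_ms`, `threshold_factor`, and `noise_alpha` are assumptions made for illustration, and the first frame is assumed to be background noise.

```python
import numpy as np


def segment_speech(signal, sample_rate, frame_ms=20, min_speech_ms=100,
                   threshold_factor=3.0, noise_alpha=0.95):
    """Return (start_s, end_s) pairs where frame energy exceeds an adaptive threshold."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    min_frames = max(1, int(min_speech_ms // frame_ms))
    n_frames = len(signal) // frame_len

    # Seed the noise estimate with the first frame (assumed to be noise only).
    noise_energy = np.mean(signal[:frame_len] ** 2)
    segments, start = [], None

    def close_segment(first, last):
        # Keep the candidate only if it lasted long enough.
        if last - first >= min_frames:
            segments.append((first * frame_len / sample_rate,
                             last * frame_len / sample_rate))

    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        energy = np.mean(frame ** 2)
        if energy > threshold_factor * noise_energy:
            if start is None:
                start = i  # a candidate speech segment begins here
        else:
            # Update the background-noise estimate on non-speech frames only.
            noise_energy = noise_alpha * noise_energy + (1 - noise_alpha) * energy
            if start is not None:
                close_segment(start, i)
                start = None

    if start is not None:
        close_segment(start, n_frames)
    return segments


# Synthetic demo: quiet noise with a tone burst between 0.5 s and 1.2 s.
rng = np.random.default_rng(1)
sr = 16000
demo = 0.01 * rng.standard_normal(2 * sr)
t = np.arange(2 * sr) / sr
demo[8000:19200] += 0.5 * np.sin(2 * np.pi * 440 * t[8000:19200])
segments = segment_speech(demo, sr)
print(segments)
```

On this synthetic input the sketch should report a single segment covering the tone burst. Real speech usually needs tuning of `threshold_factor` and `min_speech_ms`, and often a short hangover period so that brief pauses do not split a word.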
**Future Work**
This repository serves as a starting point for speech processing tasks. Here are some potential areas for future development:
* Implementing more sophisticated noise cancellation techniques.
* Utilizing feature extraction methods for speaker recognition or speech classification.
* Refining the speech segmentation engine for better accuracy.

Feel free to explore, modify, and contribute to this repository!