https://github.com/nuniz/speech-audio-ml-interview
Speech & Audio Algorithms and Machine Learning Interview Questions
- Host: GitHub
- URL: https://github.com/nuniz/speech-audio-ml-interview
- Owner: nuniz
- License: mit
- Created: 2024-03-13T20:09:53.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-31T08:31:45.000Z (over 1 year ago)
- Last Synced: 2025-03-23T16:43:32.455Z (about 1 year ago)
- Topics: algorithms, audio, data-science, interview, interview-preparation, interview-questions, job, machie-learning, questions, sound, speech
- Homepage:
- Size: 310 KB
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Speech & Audio Interview Questions
Feel free to dive into any section that interests you :-)

## Table of contents
- [Speech \& Audio Algorithms and Machine Learning](#speech--audio-algorithms-and-machine-learning)
- [Table of contents](#table-of-contents)
- [Acoustics](#acoustics-)
- [Sound](#sound-)
- [Reverberation](#reverberation-)
- [Electronics](#electronics-)
- [Signal Processing](#signal-processing-)
- [Digital Filtering](#digital-filtering-)
- [Audio Features](#audio-features-)
- [Audio Transforms](#audio-transforms-)
- [Compression](#compression-)
- [Noise Reduction](#noise-reduction-)
- [Deep Learning](#deep-learning-)
- [Sound Classification](#sound-classification-)
- [Speech Enhancement](#speech-enhancement-)
- [Speaker Recognition](#speaker-recognition-)
- [Speech Recognition](#speech-recognition-)
* What is the difference between sound power and sound intensity?
* How do we convert sound pressure between dB SPL and pascals (Pa)?
* What’s the difference between dB SPL and dB(A)?
* How do density and elasticity of a medium affect sound speed?
* What is the Doppler effect, and how does it work?
* What is room impulse response (RIR), and how is it measured?
* What are the effects of reverberation in room acoustics?
* How is reverberation measured (RT60)?
* How can we simulate reverberation digitally?
* What methods are used to analyze time delay in audio signals?
* What should you consider when choosing a microphone?
* How do you calibrate a microphone?
* What is an Anti-Aliasing filter?
* What are typical sampling rates and bit depths for audio?
* What are the common interfaces used in digital audio systems?
* How do FIR and IIR filters differ?
* What does the filtfilt function do?
* How does a preamplifier work in a microphone setup?
* How is zero-phase filtering done, and what are its benefits?
* How can we test the stability of digital filters?
* What is signal energy, and how do we calculate it?
* What are the uses of ZCR and FFT in audio analysis?
* How can we estimate the pitch of speech?
* What are common audio features, and how do we extract them?
* How can we test the similarity between two audio signals?
* What is STFT, and how is it done?
* What are the key considerations when implementing STFT?
* What is MFCC used for in audio processing?
* How does the number of quantizer levels change the dynamic range?
* How does ADPCM work?
* What is LPC, and how does it represent speech?
* How is mu-law quantization different from linear quantization?
* How does spectral subtraction work?
* What is the Wiener filtering method?
* When is wavelet-based denoising useful?
* What is Speech Presence Probability (SPP), and how is it used?
* How is adaptive filtering used for noise reduction and echo cancellation?
* What are the challenges in sound classification?
* How is deep learning used in sound classification?
* What metrics are used to evaluate classification models?
* What deep networks are common for speech enhancement?
* How is phase handled in speech enhancement?
* Why might MSE not be the best loss function?
* What metrics are used to evaluate speech enhancement models?
* What’s the difference between diarization, identification, and verification?
* What networks are used for speaker recognition?
* What are speaker embeddings, and how are they used?
* How are x-vectors different from i-vectors?
* What methods are used for speech recognition?
* How is audio prepared for speech recognition?
* How are speech recognition models evaluated?
* How does Whisper use weak supervision?
* What is the Whisper model architecture?
* What are the key features and differences between Wav2Vec models?
* How does CTC encoding help Wav2Vec?
* What’s the role of Beam Search in Wav2Vec?
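
As a worked illustration for the dB SPL question in the list above, here is a minimal Python sketch of the conversion in both directions. It assumes the usual reference pressure for sound in air, 20 µPa, and RMS pressure values. The function names `pa_to_db_spl` and `db_spl_to_pa` are hypothetical and only serve the example.

```python
import math

# Assumed reference pressure for dB SPL in air: 20 micropascals (RMS).
P_REF = 20e-6


def pa_to_db_spl(pressure_pa: float) -> float:
    """Convert an RMS sound pressure in pascals to a level in dB SPL."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF)


def db_spl_to_pa(level_db: float) -> float:
    """Convert a level in dB SPL back to an RMS sound pressure in pascals."""
    return P_REF * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    # 1 Pa RMS is the level commonly quoted as "about 94 dB SPL".
    print(f"1 Pa      -> {pa_to_db_spl(1.0):.2f} dB SPL")
    print(f"94 dB SPL -> {db_spl_to_pa(94.0):.4f} Pa")
```

The factor of 20 (rather than 10) appears because pressure is a field quantity: intensity is proportional to pressure squared, so a tenfold increase in pressure is a 20 dB increase in level.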