{"id":17093538,"url":"https://github.com/nuniz/speech-audio-ml-interview","last_synced_at":"2026-01-22T06:39:18.063Z","repository":{"id":227548034,"uuid":"771713885","full_name":"nuniz/speech-audio-ml-interview","owner":"nuniz","description":"Speech \u0026 Audio Algorithms and Machine Learning Interview Questions","archived":false,"fork":false,"pushed_at":"2024-12-31T08:31:45.000Z","size":317,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T16:43:32.455Z","etag":null,"topics":["algorithms","audio","data-science","interview","interview-preparation","interview-questions","job","machie-learning","questions","sound","speech"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nuniz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-13T20:09:53.000Z","updated_at":"2025-02-18T14:36:03.000Z","dependencies_parsed_at":"2024-03-13T23:24:55.206Z","dependency_job_id":"358a213f-8feb-4903-b900-df889ef0fa1b","html_url":"https://github.com/nuniz/speech-audio-ml-interview","commit_stats":null,"previous_names":["nuniz/speech-audio-ml-interview"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/nuniz/speech-audio-ml-interview","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nuniz%2Fspeech-audio-ml-interview","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nuniz%2Fspeech-audio-ml-interview/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nuniz%2Fspeech-audio-ml-interview/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nuniz%2Fspeech-audio-ml-interview/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nuniz","download_url":"https://codeload.github.com/nuniz/speech-audio-ml-interview/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nuniz%2Fspeech-audio-ml-interview/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28657070,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T01:17:37.254Z","status":"online","status_checked_at":"2026-01-22T02:00:07.137Z","response_time":144,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithms","audio","data-science","interview","interview-preparation","interview-questions","job","machie-learning","questions","sound","speech"],"created_at":"2024-10-14T14:07:28.108Z","updated_at":"2026-01-22T06:39:18.046Z","avatar_url":"https://github.com/nuniz.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Speech \u0026 Audio Interview Questions\nFeel free to dive into any section that interests you :-)\n\n![Diagram](images/diagram-v1.png)\n\n\u003cdetails\u003e\n\n\u003csummary\u003eTable of contents\u003c/summary\u003e\n\n- [Speech \\\u0026 Audio Algorithms and Machine Learning](#speech--audio-algorithms-and-machine-learning)\n- [Table of contents](#table-of-contents)\n- [Acoustics ](#acoustics-)\n  - [Sound ](#sound-)\n  - [Reverberation ](#reverberation-)\n- [Electronics ](#electronics-)\n- [Signal Processing ](#signal-processing-)\n  - [Digital Filtering ](#digital-filtering-)\n  - [Audio Features ](#audio-features-)\n  - [Audio Transforms ](#audio-transforms-)\n  - [Compression ](#compression-)\n  - [Noise Reduction ](#noise-reduction-)\n- [Deep Learning ](#deep-learning-)\n  - [Sound Classification ](#sound-classification-)\n  - [Speech Enhancement ](#speech-enhancement-)\n  - [Speaker Recognition ](#speaker-recognition-)\n  - [Speech Recognition ](#speech-recognition-)\n\u003c/details\u003e\n\n# Acoustics \u003ca name=\"acoustics\"\u003e\u003c/a\u003e\n\n## Sound \u003ca name=\"sound\"\u003e\u003c/a\u003e\n\n* What is the difference between sound power and sound intensity?\n* How do we convert sound pressure between dB SPL and pascals (Pa)?\n* What’s the difference between dB SPL and dB(A)?\n* How do density and elasticity of a medium affect sound speed?\n* What is the Doppler effect, and how does it work?\n\n## Reverberation \u003ca name=\"reverb\"\u003e\u003c/a\u003e\n\n* What is room impulse response (RIR), and how is it measured?\n* What are the effects of reverberation in room acoustics?\n* How is reverberation measured (RT60)?\n* How can we simulate reverberation digitally?\n* What methods are used to analyze time delay in audio signals?\n\n# Electronics \u003ca name=\"electronics\"\u003e\u003c/a\u003e\n\n* What should you consider when choosing a microphone?\n* How do you calibrate a microphone?\n* What is an Anti-Aliasing filter?\n* What are typical sampling rates and bit ranges for audio?\n* What are the common interfaces used in digital audio systems?\n\n# Signal Processing \u003ca name=\"signal_processing\"\u003e\u003c/a\u003e\n\n## Digital Filtering \u003ca name=\"digital_filter\"\u003e\u003c/a\u003e\n\n* How do FIR and IIR filters differ?\n* What does the filtfilt function do?\n* How does a preamplifier work in a microphone setup?\n* How is zero-phase filtering done, and what are its benefits?\n* How can we test the stability of digital filters?\n\n## Audio Features \u003ca name=\"features\"\u003e\u003c/a\u003e\n\n* What is signal energy, and how do we calculate it?\n* What are the uses of ZCR and FFT in audio analysis?\n* How can we estimate the pitch of speech?\n* What are common audio features, and how do we extract them?\n* How can we test the similarity between two audio signals?\n\n## Audio Transforms \u003ca name=\"audio_transforms\"\u003e\u003c/a\u003e\n\n* What is STFT, and how is it done?\n* What are the key considerations when implementing STFT?\n* What is MFCC used for in audio processing?\n\n## Compression \u003ca name=\"compression\"\u003e\u003c/a\u003e\n\n* How does the number of quantizer levels change the dynamic range?\n* How does AD-PCM work?\n* What is LPC, and how does it represent speech?\n* How is mu-law quantization different from linear quantization?\n\n## Noise Reduction \u003ca name=\"noise_reduction\"\u003e\u003c/a\u003e\n\n* How does spectral subtraction work?\n* What is the Wiener filtering method?\n* When is wavelet-based denoising useful?\n* What is Speech Presence Probability (SPP), and how is it used?\n* How is adaptive filtering used for noise reduction and echo cancellation?\n\n# Deep Learning \u003ca name=\"deep_learning\"\u003e\u003c/a\u003e\n\n## Sound Classification \u003ca name=\"classification\"\u003e\u003c/a\u003e\n\n* What are the challenges in sound classification?\n* How is deep learning used in sound classification?\n* What metrics evaluate classification models?\n\n## Speech Enhancement \u003ca name=\"enhancement\"\u003e\u003c/a\u003e\n\n* What deep networks are common for speech enhancement?\n* How is phase handled in speech enhancement?\n* Why might MSE not be the best loss function?\n* What metrics evaluate speech enhancement models?\n\n## Speaker Recognition \u003ca name=\"speaker\"\u003e\u003c/a\u003e\n\n* What’s the difference between diarization, identification, and verification?\n* What networks are used for speaker recognition?\n* What are speaker embeddings, and how are they used?\n* How are x-vectors different from i-vectors?\n\n## Speech Recognition \u003ca name=\"recognition\"\u003e\u003c/a\u003e\n\n* What methods are used for speech recognition?\n* How is audio prepared for speech recognition?\n* How are speech recognition models evaluated?\n* How does Whisper use weak supervision?\n* What is the Whisper model architecture?\n* What are the key features and differences between Wav2Vec models?\n* How does CTC encoding help Wav2Vec?\n* What’s the role of Beam Search in Wav2Vec?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnuniz%2Fspeech-audio-ml-interview","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnuniz%2Fspeech-audio-ml-interview","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnuniz%2Fspeech-audio-ml-interview/lists"}