https://github.com/bunyaminergen/wavlmmsdd
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from NVIDIA.
diarization embedding microsoft nvidia-nemo speaker-diarization speech speech-embedding wavlm
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/bunyaminergen/wavlmmsdd
- Owner: bunyaminergen
- License: gpl-3.0
- Created: 2025-02-14T14:03:51.000Z (3 months ago)
- Default Branch: develop
- Last Pushed: 2025-02-14T18:06:32.000Z (3 months ago)
- Last Synced: 2025-02-14T18:16:33.681Z (3 months ago)
- Topics: diarization, embedding, microsoft, nvidia-nemo, speaker-diarization, speech, speech-embedding, wavlm
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0