{"id":18682646,"url":"https://github.com/linto-ai/linto-diarization","last_synced_at":"2025-06-10T10:08:48.210Z","repository":{"id":40752676,"uuid":"445191107","full_name":"linto-ai/linto-diarization","owner":"linto-ai","description":"Speaker diarization service","archived":false,"fork":false,"pushed_at":"2025-04-11T13:58:40.000Z","size":38777,"stargazers_count":21,"open_issues_count":5,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-12T04:37:30.776Z","etag":null,"topics":["asr","linto","speaker-diarization","speaker-identification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linto-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-01-06T14:06:17.000Z","updated_at":"2025-01-30T03:24:24.000Z","dependencies_parsed_at":"2023-02-12T18:45:16.964Z","dependency_job_id":"08530d3f-80e4-4194-a5ae-de097ca6118b","html_url":"https://github.com/linto-ai/linto-diarization","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linto-ai%2Flinto-diarization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linto-ai%2Flinto-diarization/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linto-ai%2Flinto-diarization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linto-ai%2Flinto-diarization/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linto-ai","download_url":"https://codeload.github.com/linto-ai/linto-diarization/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linto-ai%2Flinto-diarization/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259053538,"owners_count":22798438,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","linto","speaker-diarization","speaker-identification"],"created_at":"2024-11-07T10:12:28.402Z","updated_at":"2025-06-10T10:08:48.171Z","avatar_url":"https://github.com/linto-ai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LinTO-diarization\n\nLinTO-diarization is an API for Speaker Diarization (segmenting an audio stream into homogeneous segments according to the speaker identity),\nwith some capabilities for Speaker Identification when audio samples of known speakers are provided.\n\nLinTO-diarization can currently work with several technologies.\nThe following families of technologies are currently supported (please refer to the respective documentation for more details):\n* [PyAnnote](pyannote/README.md)\n* [simple_diarizer](simple/README.md)\n* [PyBK](pybk/README.md) (deprecated)\n\nLinTO-diarization can either be used as a standalone diarization service or deployed within a micro-services infrastructure using a message broker connector.\n\n## Quick test\n\nBelow are examples of how to test diarization with \"PyAnnote\", on Linux with Docker installed.\n\n\"PyAnnote\" is the 
recommended diarization method.\nIn what follows, you can replace \"pyannote\" with \"simple\" or \"pybk\" to try other methods.\n\n### HTTP Server\n\n1. If you want to use speaker identification, make sure Qdrant is running.\nFirst, create a custom bridge network so that the diarization container can communicate with Qdrant:\n\n```bash\ndocker network create diarization_network\n```\n\nThen start Qdrant with the following Docker command:\n\n```bash\n# 6333 is Qdrant's default port\ndocker run \\\n    --name qdrant \\\n    --network diarization_network \\\n    -p 6333:6333 \\\n    -v ./qdrant_storage:/qdrant/storage:z \\\n    qdrant/qdrant\n```\n\n2. If needed, build the Docker image:\n\n```bash\ndocker build . -t linto-diarization-pyannote:latest -f pyannote/Dockerfile\n```\n\n3. Launch the Docker container (and keep it running).\n\nIf you want to enable speaker identification, make sure to mount reference speaker audio samples to `/opt/speaker_samples`.\n\n```bash\n# The -v mount provides the reference speaker samples and enables speaker identification.\n# The QDRANT_* variables are only needed when speaker identification is enabled.\ndocker run -it --rm \\\n    --name linto-diarization \\\n    --network diarization_network \\\n    -p 8080:80 \\\n    -v ./data/speakers_samples:/opt/speaker_samples \\\n    --shm-size=1gb --tmpfs /run/user/0 \\\n    --env SERVICE_MODE=http \\\n    --env QDRANT_HOST=qdrant \\\n    --env QDRANT_PORT=6333 \\\n    --env QDRANT_COLLECTION_NAME=speaker_embeddings \\\n    --env QDRANT_RECREATE_COLLECTION=true \\\n    linto-diarization-pyannote:latest\n```\n\nAlternatively, you can use Docker Compose:\n\n```yaml\nservices:\n  qdrant:\n    image: qdrant/qdrant\n    container_name: qdrant\n    ports:\n      - \"6333:6333\"  # Qdrant default port\n    volumes:\n      - ./qdrant_storage:/qdrant/storage:z\n\n  diarization_app:\n    build:\n      context: .\n      dockerfile: pyannote/Dockerfile\n    container_name: diarization_app\n    shm_size: '1gb'\n    stdin_open: true\n    tty: true\n    ports:\n      - \"8080:80\"\n    environment:\n      - QDRANT_HOST\n      - QDRANT_PORT\n      - QDRANT_COLLECTION_NAME\n      - QDRANT_RECREATE_COLLECTION\n      - SERVICE_MODE\n      - SERVICE_NAME\n      - SERVICES_BROKER\n      - CONCURRENCY\n    volumes:\n      - ./data/speakers_samples:/opt/speaker_samples # Reference speaker samples; mounting them enables speaker identification\n    depends_on:\n      - qdrant  # Ensure Qdrant starts before the app\n    deploy:\n      resources:\n        reservations:\n          devices:\n            - driver: nvidia\n              count: 1\n              capabilities: [gpu]\n```\n\nRun it with:\n\n```bash\ndocker compose up\n```\n\n4. Open the Swagger UI in a browser: [http://localhost:8080/docs](http://localhost:8080/docs)\n   Unfold the `/diarization` route and click \"Try it out\". 
Then\n   - Choose a file\n   - Specify either `speaker_count` (fixed number of speakers) or `max_speaker` (maximum number of speakers)\n   - Click `Execute`\n\n### Celery worker\n\nIn the following, we assume that the audio file to test is `$HOME/test.wav`.\n\n1. If needed, build the Docker image:\n\n```bash\ndocker build . -t linto-diarization-pyannote:latest -f pyannote/Dockerfile\n```\n\n2. If you want to use speaker identification, make sure Qdrant is running. You can start Qdrant with the following Docker command:\n\n```bash\n# 6333 is Qdrant's default port\ndocker run \\\n    -p 6333:6333 \\\n    -v ./qdrant_storage:/qdrant/storage:z \\\n    qdrant/qdrant\n```\n\n3. Run the Redis server:\n\n```bash\ndocker run -it --rm \\\n    -p 6379:6379 \\\n    redis/redis-stack-server:latest \\\n    redis-server /etc/redis-stack.conf --protected-mode no --bind 0.0.0.0 --loglevel debug\n```\n\n4. Launch the Docker container, mounting the volume that contains the audio file you want to test:\n\n```bash\ndocker run -it --rm \\\n    -v $HOME:$HOME \\\n    --env SERVICE_MODE=task \\\n    --env SERVICE_NAME=diarization \\\n    --env SERVICES_BROKER=redis://172.17.0.1:6379 \\\n    --env BROKER_PASS= \\\n    --env CONCURRENCY=2 \\\n    --env QDRANT_HOST=localhost \\\n    --env QDRANT_PORT=6333 \\\n    --env QDRANT_COLLECTION_NAME=speaker_embeddings \\\n    --env QDRANT_RECREATE_COLLECTION=true \\\n    linto-diarization-pyannote:latest\n```\n\n5. 
Testing with a given audio file can be done using python3 (with the `celery` and `redis` packages installed).\n   For example, the following command processes the file `$HOME/test.wav` with 2 speakers:\n\n```bash\npip3 install redis celery # if not installed yet\n\npython3 -c \"\\\nimport celery; \\\nimport os; \\\nworker = celery.Celery(broker='redis://localhost:6379/0', backend='redis://localhost:6379/1'); \\\nprint(worker.send_task('diarization_task', (os.environ['HOME']+'/test.wav', 2, None), queue='diarization').get());\\\n\"\n```\n\n## License\nThis project is developed under the AGPLv3 License (see LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinto-ai%2Flinto-diarization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinto-ai%2Flinto-diarization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinto-ai%2Flinto-diarization/lists"}